- `instanceName` (symbolic name) - respective name of the data instance

**Implications:**

When unregistering an instance, ensure that the instance being unregistered is
**not** the MAIN instance. Unregistering MAIN can lead to an inconsistent
cluster state. Additionally, the cluster must have an **alive** MAIN instance
during the unregistration process. If no MAIN instance is available, the
operation cannot be guaranteed to succeed.

The instance requested to be unregistered will also be unregistered from the current MAIN's replica set.

**Example:**

```cypher
UNREGISTER INSTANCE instance_1;
```

### REMOVE COORDINATOR

If during cluster setup, or at some later stage of the cluster's life, the user decides to remove a coordinator instance, the `REMOVE COORDINATOR` query can be used.
This query can only be executed on the leader, and only followers can be removed. The current cluster's leader cannot be removed, since this is prohibited
by NuRaft. In order to remove the current leader, you first need to trigger a leadership change.

```cypher
REMOVE COORDINATOR <COORDINATOR-ID>;
```
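The leader-only and cannot-remove-leader restrictions on `REMOVE COORDINATOR` described above can be sketched as pure logic. This is a hypothetical illustration, not Memgraph's internal API; the function name and return shape are assumptions made for the example.

```python
# Hypothetical sketch (not Memgraph's internal API): the validation rules
# behind REMOVE COORDINATOR, modelled as a pure function.

def validate_remove_coordinator(executing_role, target_id, leader_id):
    """Return (ok, message) for a REMOVE COORDINATOR <target-id> request."""
    if executing_role != "leader":
        # Only the leader may execute this query to remove followers.
        return (False, "query must be executed on the leader")
    if target_id == leader_id:
        # NuRaft prohibits removing the current leader.
        return (False, "cannot remove the leader; trigger a leadership change first")
    return (True, f"REMOVE COORDINATOR {target_id};")

print(validate_remove_coordinator("leader", 2, 1))    # allowed: follower removal
print(validate_remove_coordinator("follower", 2, 1))  # refused: not the leader
print(validate_remove_coordinator("leader", 1, 1))    # refused: target is leader
```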
## Replication role management queries

### SET INSTANCE TO MAIN

Once all data instances are registered, one data instance should be promoted to main.
This can be achieved by using the following query:

```cypher
SET INSTANCE instanceName TO MAIN;
```

**Parameters:**

- `instanceName` (symbolic name) - name of the data instance that is going to be promoted to main

**Behaviour:**

This query will register all other instances as replicas to the new main.
This operation will result in writing to the Raft log.

**Implications:**

If one of the instances is unavailable, setting the instance to MAIN will not succeed.
If there is already a MAIN instance in the cluster, this query will fail.

**Example:**

```cypher
SET INSTANCE instance_0 TO MAIN;
```

### DEMOTE INSTANCE

The `DEMOTE INSTANCE` query can be used by an admin to demote the current MAIN to REPLICA.

```cypher
DEMOTE INSTANCE instanceName;
```

**Behaviour:**

- MAIN is demoted to REPLICA.
- This operation will result in writing to the Raft log.

**Implications:**

- The leader coordinator won't perform a failover after this demotion; as a user, you should promote one of
the data instances to MAIN using the `SET INSTANCE instanceName TO MAIN` query.

<Callout type="info">

By combining the functionalities of the queries `DEMOTE INSTANCE instanceName` and `SET INSTANCE instanceName TO MAIN` you get the manual failover capability. This can be useful
e.g. during maintenance work on the instance where the current MAIN is deployed.
</Callout>
**Example:**

```cypher
DEMOTE INSTANCE instance1;
```
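The manual failover described in the callout above boils down to issuing the two queries in order against the leader. A minimal sketch, assuming some Bolt client whose query-running callable is represented here by the hypothetical `run_query` parameter:

```python
# Hypothetical sketch: manual failover = DEMOTE INSTANCE + SET INSTANCE ... TO MAIN.
# `run_query` stands in for whatever Bolt client is used to talk to the leader.

def manual_failover_queries(current_main, new_main):
    """Query sequence for manually moving MAIN from one instance to another."""
    return [
        f"DEMOTE INSTANCE {current_main};",
        f"SET INSTANCE {new_main} TO MAIN;",
    ]

def run_failover(run_query, current_main, new_main):
    # Order matters: the cluster must not already have a MAIN when promoting.
    for query in manual_failover_queries(current_main, new_main):
        run_query(query)

executed = []
run_failover(executed.append, "instance_1", "instance_2")
print(executed)
```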
## Monitoring commands

### SHOW INSTANCES

You can check the state of the whole cluster using the `SHOW INSTANCES` query.

```cypher
SHOW INSTANCES;
```

**Behaviour:**

The query will display all the Memgraph servers visible in the cluster. With
each server you can see the following information:
1. Network endpoints they are using for managing cluster state
2. Health state of the server
3. Role - MAIN, REPLICA, LEADER, FOLLOWER, or unknown if not alive
4. The time passed since the last response to the leader's health ping

**Implications:**

This query can be run on either the leader or followers. Since only the leader knows the exact status of the health state
and last response time, followers will execute actions in this exact order:
1. Try contacting the leader to get the health state of the cluster, since the leader has all the information.
If the leader responds, the follower will return the result as if the `SHOW INSTANCES` query was run on the leader.
2. When the leader doesn't respond or currently there is no leader, the follower will return all the Memgraph servers
with the health state set to "down".
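The follower-side fallback in the two steps above can be sketched as pure logic. The row shape, server list, and exception type below are illustrative assumptions, not Memgraph's actual wire format:

```python
# Hypothetical sketch of the follower-side fallback for SHOW INSTANCES:
# ask the leader first; if no leader is reachable, report every known
# Memgraph server with health "down". Row shapes are illustrative.

def show_instances_on_follower(ask_leader, known_servers):
    """`ask_leader` returns the leader's view of the cluster or raises."""
    try:
        # Step 1: the leader has the authoritative health information.
        return ask_leader()
    except ConnectionError:
        # Step 2: no leader responded - report all servers as down.
        return [{"name": name, "health": "down"} for name in known_servers]

def leader_unreachable():
    raise ConnectionError("no leader elected")

servers = ["coordinator_1", "instance_1", "instance_2"]
print(show_instances_on_follower(leader_unreachable, servers))
```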
### SHOW INSTANCE

You can check the state of the current coordinator to which you are connected by running the following query:

```cypher
SHOW INSTANCE;
```

**Behaviour:**

This query will return information about:
1. instance name
2. external bolt server to which you can connect using Memgraph clients
3. coordinator server over which Raft communication is done
4. management server which is also used for inter-coordinator communication, and
5. cluster role: whether the coordinator is currently a leader or a follower.

**Implications:**

If the query `ADD COORDINATOR` wasn't run for the current instance, the value of the bolt server will be "".

### SHOW REPLICATION LAG

The user can find the current replication lag on each instance by running `SHOW REPLICATION LAG` on the cluster's leader.
The replication lag is expressed as the number of committed transactions.

```cypher
SHOW REPLICATION LAG;
```

**Implications:**

- This information is made durable through snapshots and WALs, so restarts won't cause information loss.
- The information about the replication lag can be useful when manually performing a failover to check whether there is a
risk of data loss.
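For instance, before a manual failover you might compare lag values to pick the promotion candidate that would lose the fewest committed transactions. The mapping below is an illustrative stand-in for parsed `SHOW REPLICATION LAG` output, not its actual result format:

```python
# Hypothetical sketch: choose a failover candidate from SHOW REPLICATION LAG
# output, modelled here as instance name -> number of committed transactions.

def missing_transactions(lags, main, candidate):
    """How many committed transactions the candidate lacks compared to MAIN."""
    return lags[main] - lags[candidate]

def best_candidate(lags, main):
    """The replica that would lose the fewest transactions if promoted."""
    replicas = [name for name in lags if name != main]
    return min(replicas, key=lambda name: missing_transactions(lags, main, name))

lags = {"instance_1": 120, "instance_2": 118, "instance_3": 95}
print(missing_transactions(lags, "instance_1", "instance_2"))  # 2
print(best_candidate(lags, "instance_1"))  # instance_2
```

Promoting `instance_2` here risks losing at most 2 committed transactions, against 25 for `instance_3`.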
## Troubleshooting commands

### FORCE RESET CLUSTER STATE

In case the cluster can't get into a healthy state, or any unexpected event occurs, there is an option to do a force
reset of the cluster.

```cypher
FORCE RESET CLUSTER STATE;
```

**Behaviour:**

1. The coordinator instance will demote each alive instance to REPLICA.
2. From the alive instances, it will choose a new MAIN instance.
3. Instances that are down will be demoted to replicas once they come back up.

This operation will result in writing to the Raft log.
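The three steps above can be modelled as pure logic. How the coordinator actually selects the new MAIN is not specified here; the alphabetical choice below is only an illustrative placeholder:

```python
# Hypothetical model of FORCE RESET CLUSTER STATE over a map of
# instance name -> health ("up"/"down"); returns instance name -> role.

def force_reset(instances):
    # 1. Demote each instance to replica (down instances are demoted
    #    once they come back up).
    roles = {name: "replica" for name in instances}
    # 2. Choose a new MAIN among the alive instances; picking the first
    #    alphabetically is only an illustrative policy, not Memgraph's.
    alive = sorted(name for name, health in instances.items() if health == "up")
    if alive:
        roles[alive[0]] = "main"
    return roles

state = {"instance_1": "down", "instance_2": "up", "instance_3": "up"}
print(force_reset(state))  # instance_2 becomes main, the others replicas
```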
**Implications:**

You need to execute this command on the leader coordinator.
0 commit comments