From a59d25fb9eb0f0b9b7ef8183ad813c214c2a755c Mon Sep 17 00:00:00 2001
From: Gabriel Ciciliani
Date: Sat, 16 May 2026 10:27:14 -0300
Subject: [PATCH 01/10] Documentation to enable Automatic schema changes

---
 docs/releem-agent/automatic-schema-changes.md | 108 ++++++++++++++++++
 1 file changed, 108 insertions(+)
 create mode 100644 docs/releem-agent/automatic-schema-changes.md

diff --git a/docs/releem-agent/automatic-schema-changes.md b/docs/releem-agent/automatic-schema-changes.md
new file mode 100644
index 0000000..a40a330
--- /dev/null
+++ b/docs/releem-agent/automatic-schema-changes.md
@@ -0,0 +1,108 @@
+---
+id: automatic-schema-changes
+title: Automatic Schema Changes
+---
+
+# Automatic schema changes in the Releem Agent
+
+If the **Releem Agent** is already installed and running, you can allow it to execute approved schema changes on the server. Automatic schema changes also include the option of running a pre-change backup, in case a rollback is required.
+
+Both automatic schema changes and backups were implemented with availability in mind, so they will only run if:
+* There is enough disk space to perform both the backup and the schema change
+* The backup won't block the affected tables
+* Point-in-time restore is possible on the server
+* The schema change won't block the affected tables
+
+The following steps explain how to configure the agent and the database user to handle this new functionality.
+
+---
+
+## 1. Locate the configuration file
+
+To enable automatic schema changes, we need to include a few new parameters in the agent configuration file. Below is the default location for Linux servers. Open the file with your favorite editor to add the new parameters.
+
+| Platform | Default path |
+|----------|----------------|
+| Linux | `/opt/releem/releem.conf` |
+
+---
+## 2. Enable automatic schema (DDL) execution
+
+By default the agent **does not** run schema changes from Releem, even when you approve them in the product. For schema changes to be executed on your database server, activate this feature explicitly by setting `enable_exec_ddl` to `true`.
+
+Before running the schema change against the real table, the agent will perform a dry-run of the change against an empty table with the same structure. This is to guarantee that the operation can run successfully with the intended strategy.
+
+There are some schema changes that the database server can't execute on its own without blocking the table. An alternative is to use an external tool called [pt-online-schema-change](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html). This tool creates a copy of the table with the intended changes, copies all data to this new table, and swaps it with the existing one, with minimal impact.
+
+[pt-online-schema-change](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html) needs to be available on the server, and the location of the tool can be specified in the configuration.
+
+| Setting | Values | What it does |
+|---------|--------|----------------|
+| `enable_exec_ddl` | `false` (default) or `true` | When `true`, the agent may execute **schema changes** that Releem sends after analysis. When `false`, those changes are not run; the agent reports that execution is disabled. |
+| `ptosc_path` | `pt-online-schema-change` | Path to the `pt-online-schema-change` binary. Set it when Percona Toolkit is not on `PATH` or you use a non-standard binary location. |
+| `online_ddl_test_schema` | `releem_online_ddl_test` (default) or any valid database/schema name | **Optional:** Database/schema name where the agent will test the schema change before executing it against the real table |
+
+
+---
+## 3. Configure your backup settings
+
+When a pre-change backup is requested, the agent needs tools and extra disk space available on the **same host that runs the agent**. As mentioned before, the Releem agent will look for the best alternative to back up the affected tables before the schema change is executed.
+
+* If the server and the table support it, the agent will create a physical backup of the table using `xtrabackup` or `mariabackup`
+* If online physical backup is not an option, the agent will use `mysqldump` to create a logical backup of the data (a `.sql` file with the necessary statements to re-create the table and the data)
+
+Releem only proceeds with the backup when it detects that **point-in-time recovery** is available for the instance. If not, the change that required the backup will not run.
+
+
+| Setting | Values | What it does |
+|---------|--------|----------------|
+| `backup_dir` | `/tmp/backups` (default) | Directory for backup output. Must exist or be creatable and have enough free space. |
+| `mysqldump_path` | `mysqldump` (default) | Full path or name on `PATH` for `mysqldump` (logical backup). |
+| `xtrabackup_path` | `xtrabackup` (default) | Full path or name on `PATH` for `xtrabackup` (physical backup when Releem selects that method). |
+| `backup_space_buffer` | `20.0` (default) | Extra free space (as a percentage) the agent requires above its estimated backup size before starting a backup. |
+
+
+---
+## 4. Extend database user permissions
+
+The same **MySQL user** the agent already uses for monitoring must have permission to run the approved ALTER statements. Connect to the target database server and run the GRANT statements below:
+
+```sql
+-- To allow table ALTERs and new indexes on **any** database
+GRANT CREATE, REFERENCES, INDEX, ALTER ON *.* TO `releem`@`127.0.0.1`;
+```
+
+```sql
+-- Alternative: grant ALTER permissions *only* on a specific database
+GRANT CREATE, REFERENCES, INDEX, ALTER ON `airportdb`.* TO `releem`@`127.0.0.1`;
+```
+
+```sql
+-- Needed for schema changes dry-runs (note this only affects the test database)
+GRANT CREATE, DROP, INDEX, ALTER ON `releem_online_ddl_test`.* TO `releem`@`127.0.0.1`;
+```
+
+#### Optional - To use pt-online-schema-change as an alternative method when the operation can't be executed online by the server
+```sql
+GRANT SELECT, INSERT, DROP, RELOAD, SUPER, SHOW VIEW, TRIGGER ON *.* TO `releem`@`127.0.0.1`;
+```
+
+---
+
+
+
+## 5. Restart the agent
+
+
+After editing, **restart the Releem Agent** so changes take effect.
+
+---
+
+## External tools
+
+Install **mysqldump**, **XtraBackup**, **mariabackup** and **pt-online-schema-change** as appropriate for your database server and OS flavor. For more information about how to install these tools, please refer to:
+
+* [pt-online-schema-change](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html)
+* [xtrabackup](https://docs.percona.com/percona-xtrabackup/2.4/index.html)
+* [mariabackup](https://mariadb.com/docs/server/server-usage/backup-and-restore/mariadb-backup/mariadb-backup-overview#installing-mariadb-backup)
+* [mysqldump](https://dev.mysql.com/doc/refman/9.7/en/mysqldump.html)
From d40a25d36ea737eade3b8384a71edda328c9dc05 Mon Sep 17 00:00:00 2001
From: Gabriel Ciciliani
Date: Sat, 16 May 2026 11:26:31 -0300
Subject: [PATCH 02/10] Automatic schema change troubleshooting guide

---
 .../schema-change-troubleshooting.md | 141 ++++++++++++++++++
 1 file changed, 141 insertions(+)
 create mode 100644 docs/releem-agent/schema-change-troubleshooting.md

diff --git a/docs/releem-agent/schema-change-troubleshooting.md b/docs/releem-agent/schema-change-troubleshooting.md
new file mode 100644
index 0000000..2945afa
--- /dev/null
+++ b/docs/releem-agent/schema-change-troubleshooting.md
@@ -0,0 +1,141 @@
+# Schema change troubleshooting guide
+
+This guide covers failure scenarios for **automatic schema changes** executed by the Releem Agent. When a change fails, check the task output in the Releem portal and match the **exit code** and message to the table below.
+
+Exit codes are reported as `task_exit_code` on the task status sent back to Releem. A task with **status 4** failed; **status 1** with exit code **0** succeeded.
+
+For configuration prerequisites, see [user-guide-task-automation.md](./user-guide-task-automation.md).
+ +--- + +## Exit codes set before execution starts + + +| Scenario | Exit code | Troubleshooting steps | +| --------------------------------- | --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Schema change execution disabled | **10** | Set `enable_exec_ddl = true` in `/opt/releem/releem.conf` (or your config path), restart the agent, and retry the change from Releem. | +| Invalid or malformed task payload | **2** | This is not fixable on the server alone—the task JSON from Releem is invalid or missing required fields (`schema_name`, `ddl_statement`, `analysis_results.schema_name`, `analysis_results.table_name`). Contact Releem support with the task id; retry after the platform resends a valid payload. | +| Empty schema change list | **3** | The task contained no statements to run. Retry from Releem or contact support if the change should have been scheduled. | + + +--- + +## Exit codes set during validation (per statement) + +These stop the task before any DDL or backup runs on the server. + + +| Scenario | Exit code | Troubleshooting steps | +| ----------------------------------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| DDL failed syntax validation | **4** | Fix the SQL in Releem (or cancel and recreate the change). The task output includes `syntax validation failed` and any `syntax_error` detail from analysis. Do not retry the same statement until the DDL is corrected. 
| +| No safe execution method | **5** | Releem analysis marked the change as neither online DDL nor `pt-online-schema-change` safe (`ok_online_ddl` and `ok_pt_osc` both false). Revise the change (smaller scope, different operation), use a maintenance window with manual DDL, or ask Releem why the statement was classified as blocking-only. | +| Pre-change backup required but PITR unavailable | **6** | The change requested a backup before DDL, but point-in-time recovery is not available on this instance (binlog/archiving, managed-service PITR, etc.). Enable PITR on the server or disable the pre-change backup requirement for this change in Releem if policy allows. | + + +--- + +## Exit code 7 — execution or backup failed + +All rows below use exit code **7**. The task output includes `Statement N failed:` followed by the underlying error. Enable `debug = true` in `releem.conf` and restart the agent for detailed command logs (passwords are masked). + +### Disk space and filesystem capacity + + +| Scenario | Exit code | Troubleshooting steps | +| -------------------------------------- | --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Insufficient space on MySQL datadir | **7** | Message contains `insufficient datadir free space` (must stay **above 10%** free) or `insufficient datadir capacity` (projected use after change must stay **at or below 90%**). Free space on the datadir filesystem, archive or drop unused data, or shrink large tables before retrying. Only for emergencies: set `disable_space_checks = true` in `releem.conf` if your team accepts skipping these checks. 
| +| Insufficient space in backup directory | **7** | Message contains `insufficient disk space: required` under `backup_dir`. Free space on the volume that holds `backup_dir` (default `/tmp/backups`), point `backup_dir` to a larger filesystem, or lower `backup_space_buffer` only if you accept less safety margin. | +| Cannot read datadir or table size | **7** | Messages such as `failed to resolve datadir`, `datadir is empty`, `failed to get table size`, or `failed to check datadir filesystem capacity`. Verify the agent MySQL user can run `SHOW VARIABLES LIKE 'datadir'` and query `information_schema.TABLES` for the target schema and table. | +| Cannot check backup directory | **7** | `failed to check disk space` or `failed to create backup directory`. Ensure `backup_dir` exists, is writable by the agent process, and is on a filesystem the host can stat. | + + +### Pre-change backup + + +| Scenario | Exit code | Troubleshooting steps | +| ----------------------------------- | --------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| mysqldump backup failed | **7** | Message contains `backup failed` and `mysqldump failed`. Install `mysqldump`, set `mysqldump_path` if needed, confirm `mysql_host` / `mysql_user` / `mysql_password` in `releem.conf`, and ensure the user can dump the target table. Run the same mysqldump manually as the agent user to reproduce. | +| XtraBackup backup or prepare failed | **7** | Message contains `xtrabackup backup failed` or `xtrabackup prepare failed`. Install a compatible **xtrabackup** (or **mariabackup** if your deployment maps it via `xtrabackup_path`), fix `xtrabackup_path`, and verify backup user privileges. Review tool output in agent logs with `debug = true`. 
| +| Backup configuration missing | **7** | `mysql_host is required for backup` or `config is required for backup`. Set MySQL connection settings in `releem.conf` the same way as for normal agent monitoring. | +| Backup size estimate failed | **7** | `failed to estimate backup size`. Check that the target table exists and the agent user can read `information_schema.TABLES`. | + + +### Online DDL (including dry-run on test table) + + +| Scenario | Exit code | Troubleshooting steps | +| --------------------------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Online DDL preflight (dry-run) failed | **7** | Message contains `online DDL preflight failed on test table`. The agent clones the table into `online_ddl_test_schema` (default `releem_online_ddl_test`) and runs the DDL there first. Grant `CREATE`, `DROP`, `INDEX`, `ALTER` on that schema; confirm the DDL is valid for an empty copy (same engine/structure). Fix incompatible DDL or use a change Releem routes to `pt-online-schema-change`. | +| Online DDL failed on production table | **7** | Message contains `schema change execution failed` after preflight succeeded. Often metadata locks, unsupported `ALGORITHM`/`LOCK`, or replication restrictions. Check MySQL error in agent logs; retry in a low-traffic window; resolve blocking sessions. The agent does **not** fall back to pt-osc after online DDL fails. | +| Test schema cannot be created | **7** | `test schema is required`, `failed to create test schema`, or `failed to create test table`. 
Set `online_ddl_test_schema` if the default name conflicts; grant DDL on that schema; ensure disk space for the empty clone. | +| DDL shape not supported for online path | **7** | `unsupported DDL for online clauses` or `could not locate target table in DDL statement`. Use `ALTER TABLE` or supported `CREATE INDEX` forms; ensure the statement references the analyzed `schema.table`. | +| Lock wait timeout | **7** | Online DDL sets `lock_wait_timeout = 20`. If errors mention lock wait or metadata locks, clear blocking transactions and retry, or use a maintenance window / pt-osc path if Releem allows it. | + + +### pt-online-schema-change + + +| Scenario | Exit code | Troubleshooting steps | +| ---------------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| pt-osc dry-run failed | **7** | Message contains `pt-online-schema-change dry-run failed`. Install [Percona Toolkit](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html), set `ptosc_path`, grant privileges in the user guide (`SELECT`, `INSERT`, `DROP`, `RELOAD`, `SUPER`, `SHOW VIEW`, `TRIGGER` on `*.`* when required). Run `pt-online-schema-change --dry-run` manually with the same connection settings. | +| pt-osc execute failed | **7** | Dry-run passed but `pt-online-schema-change failed` on execute. Check pt-osc output in logs (triggers, replicas, disk, permissions). Resolve replica lag or tool errors before retrying. | +| pt-osc configuration missing | **7** | `mysql_host is required for pt-online-schema-change` or `config is required for pt-online-schema-change`. Complete MySQL settings in `releem.conf`. 
| + + +### Other execution errors (exit code 7) + + +| Scenario | Exit code | Troubleshooting steps | +| -------------------------- | --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Missing table name | **7** | `table name is required for schema change execution`. Internal/task payload issue—contact Releem support with the task id. | +| Failed to parse table name | **7** | `failed to parse table name`. Ensure `analysis_results.schema_name` and `analysis_results.table_name` match the real object and use a valid `schema.table` form. | + + +--- + +## Exit code 8 — no statements executed + + +| Scenario | Exit code | Troubleshooting steps | +| -------------------------- | --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| No schema changes executed | **8** | Task output includes `No schema changes were executed.` This is returned when the loop finishes without applying any statement (unusual if earlier validation passed). Review full task output and agent logs; retry from Releem or contact support with the task id. | + + +--- + +## Success and non-failure notes + + +| Scenario | Exit code | Troubleshooting steps | +| ------------------------- | ----------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Success | **0** | Task status **1**. Output lists `Statement N successful:` for each applied DDL. No action required. | +| Non-InnoDB storage engine | *(none — warning only)* | Output may include `warning: storage engine is ...` without failing the task. 
Prefer InnoDB for online DDL and backups; plan manual change if you rely on MyISAM or other engines. | + + +--- + +## Quick reference: exit code summary + + +| Exit code | Meaning | +| --------- | ----------------------------------------------------- | +| **0** | Success | +| **2** | Invalid task payload | +| **3** | Empty change list | +| **4** | Syntax validation failed | +| **5** | No online DDL or pt-osc path | +| **6** | Pre-change backup blocked (no PITR) | +| **7** | Backup or DDL execution failed (see sub-tables above) | +| **8** | No statements executed | +| **10** | `enable_exec_ddl` is false | + + +--- + +## Where to look next + +1. Task output in the Releem portal for the exact `Statement N failed:` line. +2. Agent logs with `debug = true` (commands, preflight SQL, tool stderr). +3. MySQL server error log for the time of the failure. +4. [task-type-6-schema-changes.md](./task-type-6-schema-changes.md) for technical behavior and safety limits. + From 33349dcfa1ef5a562d97a793950e552c00962d4f Mon Sep 17 00:00:00 2001 From: Dmitry Kochetov Date: Sat, 23 May 2026 20:04:09 +0400 Subject: [PATCH 03/10] fixed --- docs/getting-started/schema-optimization.md | 2 + .../schema-change-troubleshooting.md | 43 ++++++++++++------- sidebars.js | 1 + 3 files changed, 30 insertions(+), 16 deletions(-) diff --git a/docs/getting-started/schema-optimization.md b/docs/getting-started/schema-optimization.md index 91ef689..29bc6f8 100644 --- a/docs/getting-started/schema-optimization.md +++ b/docs/getting-started/schema-optimization.md @@ -26,6 +26,8 @@ Schema optimization helps you detect and fix: 4. Test changes in a development environment first 5. Execute the SQL on your production database during low-traffic periods +If an automatic schema change fails in Releem, use the [Schema Change Troubleshooting](/releem-agent/schema-change-troubleshooting) guide to match the error to the next action. 
+ For detailed information about each type of schema check and comprehensive best practices, see the [MySQL Database Schema Checks](https://releem.com/blog/mysql-database-schema-checks) article. Schema optimization is essential for maintaining long-term database health and performance as your application grows. diff --git a/docs/releem-agent/schema-change-troubleshooting.md b/docs/releem-agent/schema-change-troubleshooting.md index 2945afa..341e813 100644 --- a/docs/releem-agent/schema-change-troubleshooting.md +++ b/docs/releem-agent/schema-change-troubleshooting.md @@ -1,16 +1,31 @@ -# Schema change troubleshooting guide +--- +id: schema-change-troubleshooting +title: Schema Change Troubleshooting +--- + +# Schema Change Troubleshooting + +This guide helps you troubleshoot failed **automatic schema changes** executed by the Releem Agent. Use it when Releem cannot apply an index or table change automatically and the Releem Dashboard shows a failed task. -This guide covers failure scenarios for **automatic schema changes** executed by the Releem Agent. When a change fails, check the task output in the Releem portal and match the **exit code** and message to the table below. +When a change fails, open the failed task in the Releem Dashboard and check: -Exit codes are reported as `task_exit_code` on the task status sent back to Releem. A task with **status 4** failed; **status 1** with exit code **0** succeeded. +- **Apply Index Error** - the detailed message, usually including `Statement N failed: ...`. +- **Agent logs** - useful when the dashboard message is not enough. See [How to Check Releem Agent Logs](/releem-agent/how-to-check-logs). -For configuration prerequisites, see [user-guide-task-automation.md](./user-guide-task-automation.md). +## Before you retry + +1. Read the exact output in the Releem Dashboard. +2. Match the message to the table below. +3. Fix the server-side issue first. Retrying without changing anything usually fails again. +4. 
If the error says the payload is invalid or empty, contact Releem support with the task id. + +Automatic schema changes are intended for environments where the Releem Agent is allowed to make DDL changes. The Agent must have enough MySQL privileges, access to the configured backup tools, and `enable_exec_ddl = true` in `/opt/releem/releem.conf` when automatic DDL execution is enabled. +For configuration prerequisites, see [Automatic Schema Changes](releem-agent/automatic-schema-changes). --- ## Exit codes set before execution starts - | Scenario | Exit code | Troubleshooting steps | | --------------------------------- | --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Schema change execution disabled | **10** | Set `enable_exec_ddl = true` in `/opt/releem/releem.conf` (or your config path), restart the agent, and retry the change from Releem. | @@ -24,7 +39,6 @@ For configuration prerequisites, see [user-guide-task-automation.md](./user-guid These stop the task before any DDL or backup runs on the server. - | Scenario | Exit code | Troubleshooting steps | | ----------------------------------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | DDL failed syntax validation | **4** | Fix the SQL in Releem (or cancel and recreate the change). The task output includes `syntax validation failed` and any `syntax_error` detail from analysis. Do not retry the same statement until the DDL is corrected. 
| @@ -38,12 +52,13 @@ These stop the task before any DDL or backup runs on the server. All rows below use exit code **7**. The task output includes `Statement N failed:` followed by the underlying error. Enable `debug = true` in `releem.conf` and restart the agent for detailed command logs (passwords are masked). -### Disk space and filesystem capacity +Because exit code **7** has several possible causes, use the message text to choose the correct row below. For example, `online DDL preflight failed on test table` and `pt-online-schema-change dry-run failed` are different problems even though both return exit code **7**. +### Disk space and filesystem capacity | Scenario | Exit code | Troubleshooting steps | | -------------------------------------- | --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Insufficient space on MySQL datadir | **7** | Message contains `insufficient datadir free space` (must stay **above 10%** free) or `insufficient datadir capacity` (projected use after change must stay **at or below 90%**). Free space on the datadir filesystem, archive or drop unused data, or shrink large tables before retrying. Only for emergencies: set `disable_space_checks = true` in `releem.conf` if your team accepts skipping these checks. | +| Insufficient space on MySQL datadir | **7** | Message contains `insufficient datadir free space` (must stay **above 10%** free) or `insufficient datadir capacity` (projected use after change must stay **at or below 90%**). Free space on the datadir filesystem, archive or drop unused data, or shrink large tables before retrying. 
Only for emergencies, set `disable_space_checks = true` in `releem.conf` if your team accepts skipping these checks. | | Insufficient space in backup directory | **7** | Message contains `insufficient disk space: required` under `backup_dir`. Free space on the volume that holds `backup_dir` (default `/tmp/backups`), point `backup_dir` to a larger filesystem, or lower `backup_space_buffer` only if you accept less safety margin. | | Cannot read datadir or table size | **7** | Messages such as `failed to resolve datadir`, `datadir is empty`, `failed to get table size`, or `failed to check datadir filesystem capacity`. Verify the agent MySQL user can run `SHOW VARIABLES LIKE 'datadir'` and query `information_schema.TABLES` for the target schema and table. | | Cannot check backup directory | **7** | `failed to check disk space` or `failed to create backup directory`. Ensure `backup_dir` exists, is writable by the agent process, and is on a filesystem the host can stat. | @@ -51,7 +66,6 @@ All rows below use exit code **7**. The task output includes `Statement N failed ### Pre-change backup - | Scenario | Exit code | Troubleshooting steps | | ----------------------------------- | --------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | mysqldump backup failed | **7** | Message contains `backup failed` and `mysqldump failed`. Install `mysqldump`, set `mysqldump_path` if needed, confirm `mysql_host` / `mysql_user` / `mysql_password` in `releem.conf`, and ensure the user can dump the target table. Run the same mysqldump manually as the agent user to reproduce. | @@ -62,7 +76,6 @@ All rows below use exit code **7**. 
The task output includes `Statement N failed ### Online DDL (including dry-run on test table) - | Scenario | Exit code | Troubleshooting steps | | --------------------------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Online DDL preflight (dry-run) failed | **7** | Message contains `online DDL preflight failed on test table`. The agent clones the table into `online_ddl_test_schema` (default `releem_online_ddl_test`) and runs the DDL there first. Grant `CREATE`, `DROP`, `INDEX`, `ALTER` on that schema; confirm the DDL is valid for an empty copy (same engine/structure). Fix incompatible DDL or use a change Releem routes to `pt-online-schema-change`. | @@ -74,10 +87,9 @@ All rows below use exit code **7**. The task output includes `Statement N failed ### pt-online-schema-change - | Scenario | Exit code | Troubleshooting steps | | ---------------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| pt-osc dry-run failed | **7** | Message contains `pt-online-schema-change dry-run failed`. Install [Percona Toolkit](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html), set `ptosc_path`, grant privileges in the user guide (`SELECT`, `INSERT`, `DROP`, `RELOAD`, `SUPER`, `SHOW VIEW`, `TRIGGER` on `*.`* when required). 
Run `pt-online-schema-change --dry-run` manually with the same connection settings. | +| pt-osc dry-run failed | **7** | Message contains `pt-online-schema-change dry-run failed`. Install [Percona Toolkit](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html), set `ptosc_path`, and grant the required permissions for the target table. Depending on your MySQL version and topology, pt-osc may require privileges such as `SELECT`, `INSERT`, `DROP`, `RELOAD`, `SUPER`, `SHOW VIEW`, `TRIGGER`. Run `pt-online-schema-change --dry-run` manually with the same connection settings. | | pt-osc execute failed | **7** | Dry-run passed but `pt-online-schema-change failed` on execute. Check pt-osc output in logs (triggers, replicas, disk, permissions). Resolve replica lag or tool errors before retrying. | | pt-osc configuration missing | **7** | `mysql_host is required for pt-online-schema-change` or `config is required for pt-online-schema-change`. Complete MySQL settings in `releem.conf`. | @@ -134,8 +146,7 @@ All rows below use exit code **7**. The task output includes `Statement N failed ## Where to look next -1. Task output in the Releem portal for the exact `Statement N failed:` line. -2. Agent logs with `debug = true` (commands, preflight SQL, tool stderr). +1. The output in the Releem portal for the exact `Statement N failed:` line. +2. Agent logs with `debug = true` in `/opt/releem/releem.conf` (commands, preflight SQL, tool stderr). 3. MySQL server error log for the time of the failure. -4. [task-type-6-schema-changes.md](./task-type-6-schema-changes.md) for technical behavior and safety limits. - +4. Releem support, if the payload is invalid, the message does not match any row above, or the same task keeps failing after the server-side issue is fixed. 
diff --git a/sidebars.js b/sidebars.js index 62d0c72..33b15e8 100644 --- a/sidebars.js +++ b/sidebars.js @@ -93,6 +93,7 @@ const sidebars = { 'query-optimization/enable-sql-query-optimization', 'query-optimization/disable-sql-query-optimization', 'query-optimization/prepared-statements-issue', + 'query-optimization/schema-change-troubleshooting', ], }, { From e603a79b31a52b522e6226f5cddd57966c28440b Mon Sep 17 00:00:00 2001 From: Dmitry Kochetov Date: Sat, 23 May 2026 20:07:23 +0400 Subject: [PATCH 04/10] fixed --- .../automatic-schema-changes.md | 0 .../schema-change-troubleshooting.md | 2 +- sidebars.js | 3 ++- 3 files changed, 3 insertions(+), 2 deletions(-) rename docs/{releem-agent => query-optimization}/automatic-schema-changes.md (100%) rename docs/{releem-agent => query-optimization}/schema-change-troubleshooting.md (99%) diff --git a/docs/releem-agent/automatic-schema-changes.md b/docs/query-optimization/automatic-schema-changes.md similarity index 100% rename from docs/releem-agent/automatic-schema-changes.md rename to docs/query-optimization/automatic-schema-changes.md diff --git a/docs/releem-agent/schema-change-troubleshooting.md b/docs/query-optimization/schema-change-troubleshooting.md similarity index 99% rename from docs/releem-agent/schema-change-troubleshooting.md rename to docs/query-optimization/schema-change-troubleshooting.md index 341e813..906211b 100644 --- a/docs/releem-agent/schema-change-troubleshooting.md +++ b/docs/query-optimization/schema-change-troubleshooting.md @@ -20,7 +20,7 @@ When a change fails, open the failed task in the Releem Dashboard and check: 4. If the error says the payload is invalid or empty, contact Releem support with the task id. Automatic schema changes are intended for environments where the Releem Agent is allowed to make DDL changes. 
The Agent must have enough MySQL privileges, access to the configured backup tools, and `enable_exec_ddl = true` in `/opt/releem/releem.conf` when automatic DDL execution is enabled. -For configuration prerequisites, see [Automatic Schema Changes](releem-agent/automatic-schema-changes). +For configuration prerequisites, see [Automatic Schema Changes](query-optimization/automatic-schema-changes). --- diff --git a/sidebars.js b/sidebars.js index 33b15e8..d77b0c5 100644 --- a/sidebars.js +++ b/sidebars.js @@ -93,7 +93,8 @@ const sidebars = { 'query-optimization/enable-sql-query-optimization', 'query-optimization/disable-sql-query-optimization', 'query-optimization/prepared-statements-issue', - 'query-optimization/schema-change-troubleshooting', + 'query-optimization/automatic-schema-changes', + 'query-optimization/schema-change-troubleshooting', ], }, { From 419058497b3a61dd36c3aef73d290b5c705ce2c6 Mon Sep 17 00:00:00 2001 From: Gabriel Ciciliani Date: Tue, 9 Jun 2026 08:39:54 -0300 Subject: [PATCH 05/10] Removing exit code references and including error msgs --- .../schema-change-troubleshooting.md | 130 +++++------------- 1 file changed, 31 insertions(+), 99 deletions(-) diff --git a/docs/query-optimization/schema-change-troubleshooting.md b/docs/query-optimization/schema-change-troubleshooting.md index 906211b..6f902d7 100644 --- a/docs/query-optimization/schema-change-troubleshooting.md +++ b/docs/query-optimization/schema-change-troubleshooting.md @@ -24,129 +24,61 @@ For configuration prerequisites, see [Automatic Schema Changes](query-optimizati --- -## Exit codes set before execution starts +## Errors before execution starts -| Scenario | Exit code | Troubleshooting steps | -| --------------------------------- | --------- | 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Schema change execution disabled | **10** | Set `enable_exec_ddl = true` in `/opt/releem/releem.conf` (or your config path), restart the agent, and retry the change from Releem. | -| Invalid or malformed task payload | **2** | This is not fixable on the server alone—the task JSON from Releem is invalid or missing required fields (`schema_name`, `ddl_statement`, `analysis_results.schema_name`, `analysis_results.table_name`). Contact Releem support with the task id; retry after the platform resends a valid payload. | -| Empty schema change list | **3** | The task contained no statements to run. Retry from Releem or contact support if the change should have been scheduled. | +| Scenario | Troubleshooting steps | +| --------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Schema change execution disabled | Set `enable_exec_ddl = true` in `/opt/releem/releem.conf` (or your config path), restart the agent, and retry the change from Releem.| --- -## Exit codes set during validation (per statement) +## Errors during validation (per statement) These stop the task before any DDL or backup runs on the server. 
-| Scenario | Exit code | Troubleshooting steps | -| ----------------------------------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| DDL failed syntax validation | **4** | Fix the SQL in Releem (or cancel and recreate the change). The task output includes `syntax validation failed` and any `syntax_error` detail from analysis. Do not retry the same statement until the DDL is corrected. | -| No safe execution method | **5** | Releem analysis marked the change as neither online DDL nor `pt-online-schema-change` safe (`ok_online_ddl` and `ok_pt_osc` both false). Revise the change (smaller scope, different operation), use a maintenance window with manual DDL, or ask Releem why the statement was classified as blocking-only. | -| Pre-change backup required but PITR unavailable | **6** | The change requested a backup before DDL, but point-in-time recovery is not available on this instance (binlog/archiving, managed-service PITR, etc.). Enable PITR on the server or disable the pre-change backup requirement for this change in Releem if policy allows. | +| Scenario | Troubleshooting steps | +| ----------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| No safe execution method | Releem analysis marked the change as neither online DDL nor `pt-online-schema-change` safe. This means that the change cannot run without temporarily blocking the affected tables, and thus, will not be executed automatically. 
A maintenance window for manual execution is required. Contact Releem for more details about this scenario. | +| Pre-change backup required but PITR unavailable | A table backup before the schema change is executed was requested, but point-in-time recovery is not available on this instance (binary log is not enabled or the retention window is too small). Enable the binary log on the server by configuring `log_bin` and make sure `expire_log_days` is greater equal or greater than 2. Alternatively, disable the pre-change backup requirement for this change. | --- -## Exit code 7 — execution or backup failed - -All rows below use exit code **7**. The task output includes `Statement N failed:` followed by the underlying error. Enable `debug = true` in `releem.conf` and restart the agent for detailed command logs (passwords are masked). - -Because exit code **7** has several possible causes, use the message text to choose the correct row below. For example, `online DDL preflight failed on test table` and `pt-online-schema-change dry-run failed` are different problems even though both return exit code **7**. +## Errors on Backup or Execution ### Disk space and filesystem capacity -| Scenario | Exit code | Troubleshooting steps | -| -------------------------------------- | --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Insufficient space on MySQL datadir | **7** | Message contains `insufficient datadir free space` (must stay **above 10%** free) or `insufficient datadir capacity` (projected use after change must stay **at or below 90%**). 
Free space on the datadir filesystem, archive or drop unused data, or shrink large tables before retrying. Only for emergencies, set `disable_space_checks = true` in `releem.conf` if your team accepts skipping these checks. | -| Insufficient space in backup directory | **7** | Message contains `insufficient disk space: required` under `backup_dir`. Free space on the volume that holds `backup_dir` (default `/tmp/backups`), point `backup_dir` to a larger filesystem, or lower `backup_space_buffer` only if you accept less safety margin. | -| Cannot read datadir or table size | **7** | Messages such as `failed to resolve datadir`, `datadir is empty`, `failed to get table size`, or `failed to check datadir filesystem capacity`. Verify the agent MySQL user can run `SHOW VARIABLES LIKE 'datadir'` and query `information_schema.TABLES` for the target schema and table. | -| Cannot check backup directory | **7** | `failed to check disk space` or `failed to create backup directory`. Ensure `backup_dir` exists, is writable by the agent process, and is on a filesystem the host can stat. | +| Scenario | Error message | Troubleshooting steps | +| -------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Insufficient space on MySQL datadir | `insufficient datadir free space: ... required >10%` or `insufficient datadir capacity for table change: projected usage ... exceeds 90% limit` | Free space must stay **above 10%** and projected use after the change must stay **at or below 90%**. Free space on the datadir filesystem, archive or drop unused data, or shrink large tables before retrying. 
This check can be disabled by setting `disable_space_checks = true` in `releem.conf` although **it is not recommended**. It should be done as a last resort and only temporarily. | +| Insufficient space in backup directory | `backup failed: insufficient disk space: required ... available ...` | Free space on the volume that holds `backup_dir` (default `/tmp/backups`), point `backup_dir` to a larger filesystem, or lower `backup_space_buffer` only if you accept less safety margin. | +| Cannot read datadir or table size | `failed to resolve datadir`, `datadir is empty`, `failed to get table size`, `failed to check datadir filesystem capacity`, or `invalid datadir filesystem size` | Verify that the agent database user has the necessary permissions on the target table. Check [Automatic schema changes in the Releem Agent](http://google.com) for more details.| +| Cannot check backup directory | `failed to check disk space` or `failed to create backup directory` | Ensure `backup_dir` exists and is accessible by the agent process . | ### Pre-change backup -| Scenario | Exit code | Troubleshooting steps | -| ----------------------------------- | --------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| mysqldump backup failed | **7** | Message contains `backup failed` and `mysqldump failed`. Install `mysqldump`, set `mysqldump_path` if needed, confirm `mysql_host` / `mysql_user` / `mysql_password` in `releem.conf`, and ensure the user can dump the target table. Run the same mysqldump manually as the agent user to reproduce. | -| XtraBackup backup or prepare failed | **7** | Message contains `xtrabackup backup failed` or `xtrabackup prepare failed`. 
Install a compatible **xtrabackup** (or **mariabackup** if your deployment maps it via `xtrabackup_path`), fix `xtrabackup_path`, and verify backup user privileges. Review tool output in agent logs with `debug = true`. | -| Backup configuration missing | **7** | `mysql_host is required for backup` or `config is required for backup`. Set MySQL connection settings in `releem.conf` the same way as for normal agent monitoring. | -| Backup size estimate failed | **7** | `failed to estimate backup size`. Check that the target table exists and the agent user can read `information_schema.TABLES`. | +| Scenario | Error message | Troubleshooting steps | +| ----------------------------------- | ------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| mysqldump backup failed | `backup failed: mysqldump failed: ...` | Make sure `mysqldump` is installed on the server and available at `mysqldump_path`. Confirm the agent database user has access to the target table. | +| XtraBackup `backup` or `prepare` failed | `backup failed: xtrabackup backup failed: ...` or `backup failed: xtrabackup prepare failed: ...` | Install a compatible version of **xtrabackup** (or **mariabackup** in case the target host is running MariaDB) and confirm the tool is available at `xtrabackup_path`. Verify the agent database user has all necessary privileges. | +| Backup size estimate failed | `failed to estimate backup size: ...` | Check that the target table still exists. It is possible that the table was renamed or dropped after the recommended change was generated. 
| ### Online DDL (including dry-run on test table) -| Scenario | Exit code | Troubleshooting steps | -| --------------------------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Online DDL preflight (dry-run) failed | **7** | Message contains `online DDL preflight failed on test table`. The agent clones the table into `online_ddl_test_schema` (default `releem_online_ddl_test`) and runs the DDL there first. Grant `CREATE`, `DROP`, `INDEX`, `ALTER` on that schema; confirm the DDL is valid for an empty copy (same engine/structure). Fix incompatible DDL or use a change Releem routes to `pt-online-schema-change`. | -| Online DDL failed on production table | **7** | Message contains `schema change execution failed` after preflight succeeded. Often metadata locks, unsupported `ALGORITHM`/`LOCK`, or replication restrictions. Check MySQL error in agent logs; retry in a low-traffic window; resolve blocking sessions. The agent does **not** fall back to pt-osc after online DDL fails. | -| Test schema cannot be created | **7** | `test schema is required`, `failed to create test schema`, or `failed to create test table`. Set `online_ddl_test_schema` if the default name conflicts; grant DDL on that schema; ensure disk space for the empty clone. | -| DDL shape not supported for online path | **7** | `unsupported DDL for online clauses` or `could not locate target table in DDL statement`. Use `ALTER TABLE` or supported `CREATE INDEX` forms; ensure the statement references the analyzed `schema.table`. | -| Lock wait timeout | **7** | Online DDL sets `lock_wait_timeout = 20`. 
If errors mention lock wait or metadata locks, clear blocking transactions and retry, or use a maintenance window / pt-osc path if Releem allows it. | +| Scenario | Error message | Troubleshooting steps | +| --------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Online DDL preflight (dry-run) failed | `schema change execution failed: online DDL preflight failed on test table ...` | The agent clones the table into `online_ddl_test_schema` (default `releem_online_ddl_test`) and runs the DDL there first. Make sure the agent database user has the necessary permissions. Check [Automatic schema changes in the Releem Agent](http://google.com) for more details. | +| Online DDL failed on production table | `schema change execution failed: ...` (after preflight succeeded) | An unexpected situation caused the backup to fail. Check the agent log for additional errors and contact Releem support. | +| Test schema cannot be created | `schema change execution failed: test schema is required for online DDL preflight`, `... failed to create test schema ...`, or `... failed to create test table ...` |Make sure the agent database user has the necessary permissions. Check [Automatic schema changes in the Releem Agent](http://google.com) for more details | +| Lock wait timeout | `failed to set session lock_wait_timeout: ...`, or `schema change execution failed: ...` mentioning lock wait / metadata locks | Online DDL sets `lock_wait_timeout = 20`. 
If errors mention lock wait or metadata locks, clear blocking transactions and retry, or retry execution during a maintenance window. | ### pt-online-schema-change -| Scenario | Exit code | Troubleshooting steps | -| ---------------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| pt-osc dry-run failed | **7** | Message contains `pt-online-schema-change dry-run failed`. Install [Percona Toolkit](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html), set `ptosc_path`, and grant the required permissions for the target table. Depending on your MySQL version and topology, pt-osc may require privileges such as `SELECT`, `INSERT`, `DROP`, `RELOAD`, `SUPER`, `SHOW VIEW`, `TRIGGER`. Run `pt-online-schema-change --dry-run` manually with the same connection settings. | -| pt-osc execute failed | **7** | Dry-run passed but `pt-online-schema-change failed` on execute. Check pt-osc output in logs (triggers, replicas, disk, permissions). Resolve replica lag or tool errors before retrying. | -| pt-osc configuration missing | **7** | `mysql_host is required for pt-online-schema-change` or `config is required for pt-online-schema-change`. Complete MySQL settings in `releem.conf`. | - - -### Other execution errors (exit code 7) - - -| Scenario | Exit code | Troubleshooting steps | -| -------------------------- | --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Missing table name | **7** | `table name is required for schema change execution`. 
Internal/task payload issue—contact Releem support with the task id. | -| Failed to parse table name | **7** | `failed to parse table name`. Ensure `analysis_results.schema_name` and `analysis_results.table_name` match the real object and use a valid `schema.table` form. | - - ---- - -## Exit code 8 — no statements executed - - -| Scenario | Exit code | Troubleshooting steps | -| -------------------------- | --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| No schema changes executed | **8** | Task output includes `No schema changes were executed.` This is returned when the loop finishes without applying any statement (unusual if earlier validation passed). Review full task output and agent logs; retry from Releem or contact support with the task id. | - - ---- - -## Success and non-failure notes - - -| Scenario | Exit code | Troubleshooting steps | -| ------------------------- | ----------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Success | **0** | Task status **1**. Output lists `Statement N successful:` for each applied DDL. No action required. | -| Non-InnoDB storage engine | *(none — warning only)* | Output may include `warning: storage engine is ...` without failing the task. Prefer InnoDB for online DDL and backups; plan manual change if you rely on MyISAM or other engines. 
| - - ---- - -## Quick reference: exit code summary - - -| Exit code | Meaning | -| --------- | ----------------------------------------------------- | -| **0** | Success | -| **2** | Invalid task payload | -| **3** | Empty change list | -| **4** | Syntax validation failed | -| **5** | No online DDL or pt-osc path | -| **6** | Pre-change backup blocked (no PITR) | -| **7** | Backup or DDL execution failed (see sub-tables above) | -| **8** | No statements executed | -| **10** | `enable_exec_ddl` is false | - - ---- - -## Where to look next - -1. The output in the Releem portal for the exact `Statement N failed:` line. -2. Agent logs with `debug = true` in `/opt/releem/releem.conf` (commands, preflight SQL, tool stderr). -3. MySQL server error log for the time of the failure. -4. Releem support, if the payload is invalid, the message does not match any row above, or the same task keeps failing after the server-side issue is fixed. +| Scenario | Error message | Troubleshooting steps | +| ---------------------------- | ----------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `pt-online-schema-change` dry-run failed | `pt-online-schema-change execution failed: pt-online-schema-change dry-run failed: ...` | Install [Percona Toolkit](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html), set `ptosc_path`, and grant the agent database user the required permissions for the target table. 
Depending on your MySQL version and topology, pt-online-schema-change may require privileges such as `SELECT`, `INSERT`, `DROP`, `RELOAD`, `SUPER`, `SHOW VIEW`, `TRIGGER`. | +| `pt-online-schema-change` execute failed | `pt-online-schema-change execution failed: pt-online-schema-change failed: ...` | Dry-run passed but the execute step failed. Check pt-online-schema-change output in logs (triggers, replicas, disk, permissions, etc) and contact Releem support. | From 69e36e53e35318e6abdc2ee268def56c1829e363 Mon Sep 17 00:00:00 2001 From: Dmitry Kochetov Date: Thu, 11 Jun 2026 19:51:51 +0400 Subject: [PATCH 06/10] reverted errors code --- .../schema-change-troubleshooting.md | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/docs/query-optimization/schema-change-troubleshooting.md b/docs/query-optimization/schema-change-troubleshooting.md index 6f902d7..e3f9015 100644 --- a/docs/query-optimization/schema-change-troubleshooting.md +++ b/docs/query-optimization/schema-change-troubleshooting.md @@ -28,8 +28,10 @@ For configuration prerequisites, see [Automatic Schema Changes](query-optimizati | Scenario | Troubleshooting steps | | --------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| DDL failed syntax validation | Fix the SQL in Releem (or cancel and recreate the change). The task output includes `syntax validation failed` and any `syntax_error` detail from analysis. Do not retry the same statement until the DDL is corrected. 
| | Schema change execution disabled | Set `enable_exec_ddl = true` in `/opt/releem/releem.conf` (or your config path), restart the agent, and retry the change from Releem.| - +| Invalid or malformed task payload | This is not fixable on the server alone—the task JSON from Releem is invalid or missing required fields (`schema_name`, `ddl_statement`, `analysis_results.schema_name`, `analysis_results.table_name`). Contact Releem support with the task id; retry after the platform resends a valid payload. | +| Empty schema change list | The task contained no statements to run. Retry from Releem or contact support if the change should have been scheduled. | --- @@ -53,7 +55,7 @@ These stop the task before any DDL or backup runs on the server. | -------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Insufficient space on MySQL datadir | `insufficient datadir free space: ... required >10%` or `insufficient datadir capacity for table change: projected usage ... exceeds 90% limit` | Free space must stay **above 10%** and projected use after the change must stay **at or below 90%**. Free space on the datadir filesystem, archive or drop unused data, or shrink large tables before retrying. This check can be disabled by setting `disable_space_checks = true` in `releem.conf` although **it is not recommended**. It should be done as a last resort and only temporarily. | | Insufficient space in backup directory | `backup failed: insufficient disk space: required ... 
available ...` | Free space on the volume that holds `backup_dir` (default `/tmp/backups`), point `backup_dir` to a larger filesystem, or lower `backup_space_buffer` only if you accept less safety margin. | -| Cannot read datadir or table size | `failed to resolve datadir`, `datadir is empty`, `failed to get table size`, `failed to check datadir filesystem capacity`, or `invalid datadir filesystem size` | Verify that the agent database user has the necessary permissions on the target table. Check [Automatic schema changes in the Releem Agent](http://google.com) for more details.| +| Cannot read datadir or table size | `failed to resolve datadir`, `datadir is empty`, `failed to get table size`, `failed to check datadir filesystem capacity`, or `invalid datadir filesystem size` | Verify that the agent database user has the necessary permissions on the target table. Check [Automatic Schema Changes](query-optimization/automatic-schema-changes) for more details.| | Cannot check backup directory | `failed to check disk space` or `failed to create backup directory` | Ensure `backup_dir` exists and is accessible by the agent process . | @@ -70,9 +72,9 @@ These stop the task before any DDL or backup runs on the server. 
| Scenario | Error message | Troubleshooting steps | | --------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Online DDL preflight (dry-run) failed | `schema change execution failed: online DDL preflight failed on test table ...` | The agent clones the table into `online_ddl_test_schema` (default `releem_online_ddl_test`) and runs the DDL there first. Make sure the agent database user has the necessary permissions. Check [Automatic schema changes in the Releem Agent](http://google.com) for more details. | +| Online DDL preflight (dry-run) failed | `schema change execution failed: online DDL preflight failed on test table ...` | The agent clones the table into `online_ddl_test_schema` (default `releem_online_ddl_test`) and runs the DDL there first. Make sure the agent database user has the necessary permissions. Check [Automatic Schema Changes](query-optimization/automatic-schema-changes) for more details. | | Online DDL failed on production table | `schema change execution failed: ...` (after preflight succeeded) | An unexpected situation caused the backup to fail. Check the agent log for additional errors and contact Releem support. | -| Test schema cannot be created | `schema change execution failed: test schema is required for online DDL preflight`, `... failed to create test schema ...`, or `... failed to create test table ...` |Make sure the agent database user has the necessary permissions. 
Check [Automatic schema changes in the Releem Agent](http://google.com) for more details | +| Test schema cannot be created | `schema change execution failed: test schema is required for online DDL preflight`, `... failed to create test schema ...`, or `... failed to create test table ...` |Make sure the agent database user has the necessary permissions. Check [Automatic Schema Changes](query-optimization/automatic-schema-changes) for more details | | Lock wait timeout | `failed to set session lock_wait_timeout: ...`, or `schema change execution failed: ...` mentioning lock wait / metadata locks | Online DDL sets `lock_wait_timeout = 20`. If errors mention lock wait or metadata locks, clear blocking transactions and retry, or retry execution during a maintenance window. | @@ -82,3 +84,11 @@ These stop the task before any DDL or backup runs on the server. | ---------------------------- | ----------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `pt-online-schema-change` dry-run failed | `pt-online-schema-change execution failed: pt-online-schema-change dry-run failed: ...` | Install [Percona Toolkit](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html), set `ptosc_path`, and grant the agent database user the required permissions for the target table. Depending on your MySQL version and topology, pt-online-schema-change may require privileges such as `SELECT`, `INSERT`, `DROP`, `RELOAD`, `SUPER`, `SHOW VIEW`, `TRIGGER`. 
| | `pt-online-schema-change` execute failed | `pt-online-schema-change execution failed: pt-online-schema-change failed: ...` | Dry-run passed but the execute step failed. Check pt-online-schema-change output in logs (triggers, replicas, disk, permissions, etc) and contact Releem support. | + + +## No statements executed + + +| Scenario | Troubleshooting steps | +| -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| No schema changes executed | Task output includes `No schema changes were executed.` This is returned when the loop finishes without applying any statement (unusual if earlier validation passed). Review full task output and agent logs; retry from Releem or contact support with the task id. | From 41d4d00397571dba9b90737bce045f0c0cf33eb1 Mon Sep 17 00:00:00 2001 From: Dmitry Kochetov Date: Fri, 12 Jun 2026 16:54:41 +0400 Subject: [PATCH 07/10] Update automatic schema change docs --- docs/getting-started/schema-optimization.md | 6 +- .../automatic-schema-changes.md | 242 +++++++++++++----- .../schema-change-troubleshooting.md | 119 ++++----- docs/releem-agent/configuration-settings.md | 26 ++ 4 files changed, 267 insertions(+), 126 deletions(-) diff --git a/docs/getting-started/schema-optimization.md b/docs/getting-started/schema-optimization.md index 29bc6f8..fe5dcc5 100644 --- a/docs/getting-started/schema-optimization.md +++ b/docs/getting-started/schema-optimization.md @@ -5,7 +5,7 @@ title: Schema Optimization # Schema Optimization -Releem's Schema Optimization feature automatically examines your database structure to identify schema issues that can impact performance, storage efficiency, and data integrity. 
It is like the schema watchdog that detects problems before they become serious and provides ready-to-use SQL recommendations. +Releem's Schema Optimization feature automatically examines your database structure to identify schema issues that can impact performance, storage efficiency, and data integrity. It is like the schema watchdog that detects problems before they become serious and provides ready-to-use SQL recommendations. ![Releem Schema Optimization](../../assets/images/releem-schema-optimization.png) @@ -26,7 +26,9 @@ Schema optimization helps you detect and fix: 4. Test changes in a development environment first 5. Execute the SQL on your production database during low-traffic periods -If an automatic schema change fails in Releem, use the [Schema Change Troubleshooting](/releem-agent/schema-change-troubleshooting) guide to match the error to the next action. +To let Releem apply approved schema changes for you instead of running SQL manually, follow [Automatic Schema Changes](/query-optimization/automatic-schema-changes). + +If an automatic schema change fails in Releem, use the [Schema Change Troubleshooting](/query-optimization/schema-change-troubleshooting) guide to match the error to the next action. For detailed information about each type of schema check and comprehensive best practices, see the [MySQL Database Schema Checks](https://releem.com/blog/mysql-database-schema-checks) article. diff --git a/docs/query-optimization/automatic-schema-changes.md b/docs/query-optimization/automatic-schema-changes.md index a40a330..e1340d2 100644 --- a/docs/query-optimization/automatic-schema-changes.md +++ b/docs/query-optimization/automatic-schema-changes.md @@ -3,106 +3,232 @@ id: automatic-schema-changes title: Automatic Schema Changes --- -# Automatic schema changes in the Releem Agent +# Automatic Schema Changes -If the **Releem Agent** is already installed and running, you can allow it to execute approved schema changes on the server. 
Automatic schema changes also include the option of running a pre-change backup, in case a rollback is required. +Releem can apply approved schema recommendations, such as creating indexes, directly from the Releem Dashboard. This is useful when you want Releem to complete the optimization workflow after you review and approve a change. -Both automatic schema changes and backups were implemented with availability in mind, so they will only run if: -* There is enough disk space to perform both, the backup and the schema change -* The backup won't block the affected tables -* Point-in-time restore is possible on the server -* The schema change won't block the affected tables +Automatic schema changes are disabled by default. Enable them only on servers where the Releem Agent is allowed to run DDL statements and where you have checked the backup and disk-space requirements. -The following steps explain how to configure the agent and the database user to handle this new functionality. +Before applying a change, the Releem Agent checks that the operation can be completed safely. Depending on the recommendation and the server, the agent may: ---- +- test the change on a temporary table before touching the production table; +- use native online DDL when MySQL or MariaDB can run the change without blocking the table; +- use `pt-online-schema-change` when the server cannot run the change online by itself; +- create a backup before the change when the recommendation requires it. -## 1. Locate the configuration file +If these checks fail, Releem does not apply the change automatically. See [Schema Change Troubleshooting](/query-optimization/schema-change-troubleshooting) for the next steps. -To enable automatic schema changes, we need to include a few new parameters in the agent configuration file. Below is the default location for Linux servers. Open the file with your favorite editor to add the new parameters. 
+## Before You Start -| Platform | Default path | -|----------|----------------| -| Linux | `/opt/releem/releem.conf` | +Automatic schema changes require: ---- -## 2. Enable automatic schema (DDL) execution +- an installed and running Releem Agent; +- [SQL Query Optimization](/query-optimization/enable-sql-query-optimization) enabled for the server; +- a MySQL user used by the agent with permissions to apply the approved schema changes; +- enough free disk space in the MySQL data directory and in the backup directory; +- point-in-time recovery when Releem requires a pre-change backup; +- additional tools installed on the same host as the agent when they are needed. -By default the agent **does not** run schema changes from Releem, even when you approve them in the product. For schema changes to be executed on your database server, activate this feature explicitly by setting `enable_exec_ddl` to `true`. +For Linux servers, the default agent configuration file is: -Before running the schema change against the real table, the agent will perform a dry-run of the change against an empty table with the same structure. This is to guarantee that the operation can run successfully with the intenteded strategy. +```bash +/opt/releem/releem.conf +``` -There are some schema changes that the database server can't execute on its own, without blocking the table. An alternative it to use an external tool called [pt-online-schema-change](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html). This tool creates a copy of the table with the intended changes, copies all data to this new table, and swaps it with the existing one, with minimum impact. +## Enable Automatic Schema Changes -[pt-online-schema-change](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html) needs to be available on the server and the location of the tool can be specified in the configuration. +### 1. 
Install the Tools Used by the Agent -| Setting | Values | What it does | -|---------|--------|----------------| -| `enable_exec_ddl` | `false` (default) or `true` | When `true`, the agent may execute **schema changes** that Releem sends after analysis. When `false`, those changes are not run; the agent reports that execution is disabled. | -| `ptosc_path` | `pt-online-schema-change` | Percona Toolkit is not on `PATH` or you use a non-standard binary location. | -| `online_ddl_test_schema` | `releem_online_ddl_test` (default) or any valid database/schema name | **Optional:** Database/schema name where the agent will test the schema change before executing it against the real table| +Install the tools that match your database type and backup strategy. +Some packages may not be available in the default operating system repositories. If your package manager cannot find a package, add the vendor repository first and then run the installation command again: ---- -## 3. Configure your backup settings +- For Percona Toolkit and Percona XtraBackup, follow [Install percona-release](https://docs.percona.com/percona-software-repositories/installing.html). +- For MariaDB Backup, follow [MariaDB Package Repository Setup and Usage](https://mariadb.com/docs/server/server-management/install-and-upgrade-mariadb/installing-mariadb/binary-packages/mariadb-package-repository-setup-and-usage). +- For MySQL client packages, follow the MySQL repository guide for your platform: [MySQL APT Repository](https://dev.mysql.com/doc/mysql-apt-repo-quick-guide/en/) or [MySQL Yum Repository](https://dev.mysql.com/doc/mysql-yum-repo-quick-guide/en/). -When a pre-change backup is requested, the agent needs tools and extra disk space available on the **same host that runs the agent**. As mentioned before, the Releem agent will look for the best alternative to backup the affected tables before the schema change is executed. 
+For Debian or Ubuntu: -* If the server and the table supports it, the agent will create a physical backup of the table using `xtrabackup` or `mariabackup` -* If online physical backup is not an option, the agent will use mysqldump to create a logical backup of the data (a `.sql` file with necessary statements to re-create the table and the data) +```bash +sudo apt-get update +sudo apt-get install default-mysql-client percona-toolkit +``` + +For RHEL, CentOS, Rocky Linux, or AlmaLinux: -Releem only proceeds with the backup when **point-in-time recovery** is available for the instance as Releem detects it. If not, the change that required the backup will not run. +```bash +sudo yum install mysql percona-toolkit +``` +On newer releases such as Rocky Linux 8+ or AlmaLinux 8+, use `dnf` instead of `yum`. -| Setting | Values | What it does | -|---------|--------|----------------| -| `backup_dir` | `/tmp/backups` (default) | Directory for backup output. Must exist or be creatable and have enough free space. | -| `mysqldump_path` | `mysqldump` (default) | Full path or name on `PATH` for `mysqldump` (logical backup). | -| `xtrabackup_path` | `xtrabackup` (default) | Full path or name on `PATH` for `xtrabackup` (physical backup when Releem selects that method). | -| `backup_space_buffer` | `20.0` (default) | Extra free space (as a percentage) the agent requires above its estimated backup size before starting a backup. | +For physical backups on MySQL, install Percona XtraBackup. The Percona repository is usually required first. Example: +Debian or Ubuntu: ---- -## 4. 
Extend database user permissions +```bash +sudo apt-get install percona-xtrabackup-80 +``` + +RHEL, CentOS, Rocky Linux, or AlmaLinux: + +```bash +sudo yum install percona-xtrabackup-80 +``` + +For MariaDB servers, install MariaDB Backup and point `xtrabackup_path` to `mariabackup`: + +Debian or Ubuntu: + +```bash +sudo apt-get install mariadb-backup +``` + +RHEL, CentOS, Rocky Linux, or AlmaLinux: + +```bash +sudo yum install MariaDB-backup +``` + +Package names can differ by operating system and repository. Use the official installation instructions linked at the end of this page for production servers. + +### 2. Configure the Releem Agent + +Open the agent configuration file: + +```bash +sudo nano /opt/releem/releem.conf +``` + +Enable DDL execution and review the paths used by the agent: + +``` +enable_exec_ddl = true + +backup_dir = "/tmp/backups" +ptosc_path = "pt-online-schema-change" +mysqldump_path = "mysqldump" +xtrabackup_path = "xtrabackup" +backup_space_buffer = 20.0 +online_ddl_test_schema = "releem_online_ddl_test" +disable_space_checks = false +``` + +Use full paths when a tool is not available on the agent process `PATH`. For example: + +``` +ptosc_path = "/usr/bin/pt-online-schema-change" +mysqldump_path = "/usr/bin/mysqldump" +xtrabackup_path = "/usr/bin/xtrabackup" +``` + +For MariaDB Backup, set: + +``` +xtrabackup_path = "mariabackup" +``` -The same **MySQL user** the agent already uses for monitoring must have permission to run the approved ALTER statements. Connect to the target database server and run the he GRANT statements below: +### 3. Prepare the Backup Directory + +Create the backup directory and make sure the Releem Agent process can write to it: + +```bash +sudo mkdir -p /tmp/backups +``` + + +### 4. Grant Database Permissions + +Connect to MySQL or MariaDB as an administrator and grant the required permissions to the same database user that the Releem Agent already uses. 
+ +For schema changes on all databases: ```sql --- To allow table ALTERs and New indexes on **any** database -GRANT CREATE, REFERENCES, INDEX, ALTER ON *.* TO `releem`@`127.0.0.1` +GRANT CREATE, REFERENCES, INDEX, ALTER ON *.* TO `releem`@`127.0.0.1`; ``` +Or grant permissions only for one database: + ```sql --- Alternative: grant ALTER permissions *only* on a specific database -GRANT CREATE, REFERENCES, INDEX, ALTER ON `airportdb`.* TO `releem`@`127.0.0.1` +GRANT CREATE, REFERENCES, INDEX, ALTER ON `your_database`.* TO `releem`@`127.0.0.1`; ``` +For the test schema used by the online DDL preflight: + ```sql --- Needed for schema changes dry-runs (note this only affects the test database) -GRANT CREATE, DROP, INDEX, ALTER ON `releem_online_ddl_test`.* TO `releem`@`127.0.0.1` +CREATE DATABASE IF NOT EXISTS `releem_online_ddl_test`; +GRANT CREATE, DROP, INDEX, ALTER ON `releem_online_ddl_test`.* TO `releem`@`127.0.0.1`; ``` -#### Optional - To use pt-online-schema-change as an alternative method when the operation can't be executed online by the server +If Releem may use `pt-online-schema-change`, grant the extra permissions needed by that tool: + ```sql -GRANT SELECT, INSERT, DROP, RELOAD, SUPER, SHOW VIEW, TRIGGER ON *.* TO `releem`@`127.0.0.1` +GRANT SELECT, INSERT, DROP, RELOAD, SUPER, SHOW VIEW, TRIGGER ON *.* TO `releem`@`127.0.0.1`; ``` ---- +On MySQL 8 and newer, use the equivalent dynamic privileges required by your security policy when `SUPER` is not allowed. +Replace `releem` and `127.0.0.1` with the user and host from your agent configuration if they are different. +### 5. Restart the Releem Agent -## 5. Restart the agent +Restart the agent so it reads the new configuration: +```bash +sudo systemctl restart releem-agent +``` -After editing, **restart the Releem Agent** so changes take effect. +If your server uses the legacy service command: ---- +```bash +sudo service releem-agent restart +``` + +### 6. 
Approve the Schema Change in Releem + +Open the Releem Dashboard, review the query recommendation, and approve the change only when you are ready for the agent to apply it. + +After the task starts, Releem shows the result in the dashboard. If the task fails, open the failed task and use [Schema Change Troubleshooting](/query-optimization/schema-change-troubleshooting). + +## Agent Configuration Reference + +| Setting | Default | Description | +| --- | --- | --- | +| `enable_exec_ddl` | `false` | Enables automatic execution of approved schema changes. Keep it `false` when you want Releem to recommend changes only. | +| `backup_dir` | `/tmp/backups` | Directory where the agent stores logical and physical backups before a schema change. | +| `ptosc_path` | `pt-online-schema-change` | Path to `pt-online-schema-change` from Percona Toolkit. Used when Releem selects that method. | +| `mysqldump_path` | `mysqldump` | Path to `mysqldump`. Used for logical table backups. | +| `xtrabackup_path` | `xtrabackup` | Path to `xtrabackup` or `mariabackup`. Used for physical backups when Releem selects that method. | +| `backup_space_buffer` | `20.0` | Extra free-space percentage required above the estimated backup size. | +| `online_ddl_test_schema` | `releem_online_ddl_test` | Schema where the agent creates temporary tables to test online DDL before applying it to the production table. | +| `disable_space_checks` | `false` | Disables disk-space checks when set to `true`. Use only temporarily and only when you have another capacity check in place. | + +## Additional Tool Installation Notes + +- `mysqldump` is usually included in MySQL client packages. +- `pt-online-schema-change` is included in Percona Toolkit. +- `xtrabackup` is installed from Percona XtraBackup packages. +- `mariabackup` is installed from MariaDB Backup packages and should be used for MariaDB servers. 
+ +After installing tools, check that the agent can find them: + +```bash +which mysqldump +which pt-online-schema-change +which xtrabackup +which mariabackup +``` -## External tools +Use the returned paths in `releem.conf` if needed. -Install **mysqldump**, **XtraBackup**, **mariabackup** and **pt-online-schema-change**as appropriate for your Database server and OS flavor. For more information about how to install these tools, please refer to: +## Documentation Links -* [pt-online-schema-change](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html) -* [xtrabackup](https://docs.percona.com/percona-xtrabackup/2.4/index.html) -* [mariabackup](https://mariadb.com/docs/server/server-usage/backup-and-restore/mariadb-backup/mariadb-backup-overview#installing-mariadb-backup) -* [mysqldump](https://dev.mysql.com/doc/refman/9.7/en/mysqldump.html) +- [Percona Toolkit installation](https://docs.percona.com/percona-toolkit/installation.html) +- [Percona repository setup](https://docs.percona.com/percona-software-repositories/installing.html) +- [pt-online-schema-change documentation](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html) +- [Percona XtraBackup installation](https://docs.percona.com/percona-xtrabackup/8.0/installation.html) +- [MariaDB repository setup](https://mariadb.com/docs/server/server-management/install-and-upgrade-mariadb/installing-mariadb/binary-packages/mariadb-package-repository-setup-and-usage) +- [MariaDB Backup documentation](https://mariadb.com/docs/server/server-usage/backup-and-restore/mariadb-backup/mariadb-backup-overview) +- [MySQL APT Repository guide](https://dev.mysql.com/doc/mysql-apt-repo-quick-guide/en/) +- [MySQL Yum Repository guide](https://dev.mysql.com/doc/mysql-yum-repo-quick-guide/en/) +- [mysqldump documentation](https://dev.mysql.com/doc/refman/8.4/en/mysqldump.html) diff --git a/docs/query-optimization/schema-change-troubleshooting.md b/docs/query-optimization/schema-change-troubleshooting.md 
index e3f9015..03bc9ec 100644 --- a/docs/query-optimization/schema-change-troubleshooting.md +++ b/docs/query-optimization/schema-change-troubleshooting.md @@ -5,90 +5,77 @@ title: Schema Change Troubleshooting # Schema Change Troubleshooting -This guide helps you troubleshoot failed **automatic schema changes** executed by the Releem Agent. Use it when Releem cannot apply an index or table change automatically and the Releem Dashboard shows a failed task. +Use this guide when an automatic schema change fails in Releem. The failed task usually shows a short message in the Releem Dashboard, and the Releem Agent logs can include more detail. -When a change fails, open the failed task in the Releem Dashboard and check: +Before you retry the task: -- **Apply Index Error** - the detailed message, usually including `Statement N failed: ...`. -- **Agent logs** - useful when the dashboard message is not enough. See [How to Check Releem Agent Logs](/releem-agent/how-to-check-logs). +1. Open the failed task in the Releem Dashboard. +2. Copy the exact error message. +3. Check the matching section below. +4. Fix the server-side issue first. +5. Retry the task from Releem. -## Before you retry +If the dashboard message is not enough, check the agent logs. See [How to Check Releem Agent Logs](/releem-agent/how-to-check-logs). -1. Read the exact output in the Releem Dashboard. -2. Match the message to the table below. -3. Fix the server-side issue first. Retrying without changing anything usually fails again. -4. If the error says the payload is invalid or empty, contact Releem support with the task id. +For setup requirements, see [Automatic Schema Changes](/query-optimization/automatic-schema-changes). -Automatic schema changes are intended for environments where the Releem Agent is allowed to make DDL changes. 
The Agent must have enough MySQL privileges, access to the configured backup tools, and `enable_exec_ddl = true` in `/opt/releem/releem.conf` when automatic DDL execution is enabled. -For configuration prerequisites, see [Automatic Schema Changes](query-optimization/automatic-schema-changes). +## The Task Did Not Start ---- - -## Errors before execution starts - -| Scenario | Troubleshooting steps | -| --------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| DDL failed syntax validation | Fix the SQL in Releem (or cancel and recreate the change). The task output includes `syntax validation failed` and any `syntax_error` detail from analysis. Do not retry the same statement until the DDL is corrected. | -| Schema change execution disabled | Set `enable_exec_ddl = true` in `/opt/releem/releem.conf` (or your config path), restart the agent, and retry the change from Releem.| -| Invalid or malformed task payload | This is not fixable on the server alone—the task JSON from Releem is invalid or missing required fields (`schema_name`, `ddl_statement`, `analysis_results.schema_name`, `analysis_results.table_name`). Contact Releem support with the task id; retry after the platform resends a valid payload. | -| Empty schema change list | The task contained no statements to run. Retry from Releem or contact support if the change should have been scheduled. | - ---- - -## Errors during validation (per statement) - -These stop the task before any DDL or backup runs on the server. 
- -| Scenario | Troubleshooting steps | -| ----------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| No safe execution method | Releem analysis marked the change as neither online DDL nor `pt-online-schema-change` safe. This means that the change cannot run without temporarily blocking the affected tables, and thus, will not be executed automatically. A maintenance window for manual execution is required. Contact Releem for more details about this scenario. | -| Pre-change backup required but PITR unavailable | A table backup before the schema change is executed was requested, but point-in-time recovery is not available on this instance (binary log is not enabled or the retention window is too small). Enable the binary log on the server by configuring `log_bin` and make sure `expire_log_days` is greater equal or greater than 2. Alternatively, disable the pre-change backup requirement for this change. | - - ---- - -## Errors on Backup or Execution +| Error in Releem | What it means | What to do | +| --- | --- | --- | +| `schema change execution is disabled` | The agent is not allowed to run DDL statements. | Set `enable_exec_ddl = true` in `/opt/releem/releem.conf`, restart the Releem Agent, and retry the task. | +| `syntax validation failed` | The SQL statement is not valid for the target server. | Do not retry the same task. Fix or recreate the schema recommendation in Releem. | +| `invalid task payload` or missing fields such as `schema_name` or `ddl_statement` | The task data sent to the agent is incomplete. | Contact Releem support with the task id. This cannot be fixed only on the server. | +| `empty schema change list` | The task did not include any statements to run. 
| Retry from Releem. If the task should contain a change, contact Releem support with the task id. | -### Disk space and filesystem capacity +## Releem Could Not Choose a Safe Method -| Scenario | Error message | Troubleshooting steps | -| -------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Insufficient space on MySQL datadir | `insufficient datadir free space: ... required >10%` or `insufficient datadir capacity for table change: projected usage ... exceeds 90% limit` | Free space must stay **above 10%** and projected use after the change must stay **at or below 90%**. Free space on the datadir filesystem, archive or drop unused data, or shrink large tables before retrying. This check can be disabled by setting `disable_space_checks = true` in `releem.conf` although **it is not recommended**. It should be done as a last resort and only temporarily. | -| Insufficient space in backup directory | `backup failed: insufficient disk space: required ... available ...` | Free space on the volume that holds `backup_dir` (default `/tmp/backups`), point `backup_dir` to a larger filesystem, or lower `backup_space_buffer` only if you accept less safety margin. | -| Cannot read datadir or table size | `failed to resolve datadir`, `datadir is empty`, `failed to get table size`, `failed to check datadir filesystem capacity`, or `invalid datadir filesystem size` | Verify that the agent database user has the necessary permissions on the target table. 
Check [Automatic Schema Changes](query-optimization/automatic-schema-changes) for more details.| -| Cannot check backup directory | `failed to check disk space` or `failed to create backup directory` | Ensure `backup_dir` exists and is accessible by the agent process . | +| Error in Releem | What it means | What to do | +| --- | --- | --- | +| `no safe execution method` | Releem could not apply the change without blocking the affected table. | Run the change manually during a maintenance window, or contact Releem support to review the recommendation. | +| `pre-change backup required but PITR unavailable` | Releem requires a backup before the change, but point-in-time recovery is not available. | Enable binary logs and keep enough retention for recovery, then retry. For MySQL, check `log_bin` and binary log expiration settings. | +## Disk Space or Filesystem Checks Failed -### Pre-change backup +| Error in Releem | What it means | What to do | +| --- | --- | --- | +| `insufficient datadir free space` or `projected usage ... exceeds 90% limit` | The MySQL data directory does not have enough free space for the schema change. | Free disk space, move data, archive old tables, or retry during a planned maintenance process after adding capacity. | +| `backup failed: insufficient disk space` | The backup directory does not have enough free space. | Free space on the filesystem used by `backup_dir`, move `backup_dir` to a larger volume, or increase available storage. | +| `failed to resolve datadir`, `datadir is empty`, `failed to get table size`, or `invalid datadir filesystem size` | The agent could not inspect the data directory or estimate the affected table size. | Check that the agent can connect to MySQL, read the target table metadata, and access the filesystem information. | +| `failed to check disk space` or `failed to create backup directory` | The agent cannot read or create the configured backup directory. 
| Create `backup_dir`, fix ownership and permissions, and make sure the filesystem is writable by the agent process. | -| Scenario | Error message | Troubleshooting steps | -| ----------------------------------- | ------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| mysqldump backup failed | `backup failed: mysqldump failed: ...` | Make sure `mysqldump` is installed on the server and available at `mysqldump_path`. Confirm the agent database user has access to the target table. | -| XtraBackup `backup` or `prepare` failed | `backup failed: xtrabackup backup failed: ...` or `backup failed: xtrabackup prepare failed: ...` | Install a compatible version of **xtrabackup** (or **mariabackup** in case the target host is running MariaDB) and confirm the tool is available at `xtrabackup_path`. Verify the agent database user has all necessary privileges. | -| Backup size estimate failed | `failed to estimate backup size: ...` | Check that the target table still exists. It is possible that the table was renamed or dropped after the recommended change was generated. | +Do not disable space checks for normal production use. `disable_space_checks = true` should be used only temporarily and only when you have another capacity check in place. +## Backup Failed -### Online DDL (including dry-run on test table) +| Error in Releem | What it means | What to do | +| --- | --- | --- | +| `mysqldump failed` | The logical backup failed before the schema change. | Install `mysqldump`, set `mysqldump_path` if needed, and confirm that the agent database user can read the target table. 
| +| `xtrabackup backup failed` or `xtrabackup prepare failed` | The physical backup failed before the schema change. | Install a compatible `xtrabackup` version for MySQL or use `mariabackup` for MariaDB. Set `xtrabackup_path` to the correct binary. | +| `failed to estimate backup size` | The agent could not estimate the table backup size. | Check that the database and table still exist and that the agent database user can read their metadata. | -| Scenario | Error message | Troubleshooting steps | -| --------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Online DDL preflight (dry-run) failed | `schema change execution failed: online DDL preflight failed on test table ...` | The agent clones the table into `online_ddl_test_schema` (default `releem_online_ddl_test`) and runs the DDL there first. Make sure the agent database user has the necessary permissions. Check [Automatic Schema Changes](query-optimization/automatic-schema-changes) for more details. | -| Online DDL failed on production table | `schema change execution failed: ...` (after preflight succeeded) | An unexpected situation caused the backup to fail. Check the agent log for additional errors and contact Releem support. | -| Test schema cannot be created | `schema change execution failed: test schema is required for online DDL preflight`, `... failed to create test schema ...`, or `... failed to create test table ...` |Make sure the agent database user has the necessary permissions. 
Check [Automatic Schema Changes](query-optimization/automatic-schema-changes) for more details | -| Lock wait timeout | `failed to set session lock_wait_timeout: ...`, or `schema change execution failed: ...` mentioning lock wait / metadata locks | Online DDL sets `lock_wait_timeout = 20`. If errors mention lock wait or metadata locks, clear blocking transactions and retry, or retry execution during a maintenance window. | +After fixing the tool or permission issue, restart the agent if you changed `releem.conf`. +## Online DDL Failed -### pt-online-schema-change +| Error in Releem | What it means | What to do | +| --- | --- | --- | +| `online DDL preflight failed on test table` | The agent tested the change in `online_ddl_test_schema`, and the test failed. | Check the SQL error in the task output. Confirm that the test schema exists and that the agent user has `CREATE`, `DROP`, `INDEX`, and `ALTER` permissions on it. | +| `test schema is required`, `failed to create test schema`, or `failed to create test table` | The agent could not create the temporary schema or table used for the preflight. | Create the schema from [Automatic Schema Changes](/query-optimization/automatic-schema-changes) and grant the required permissions. | +| `lock wait timeout`, `metadata lock`, or `failed to set session lock_wait_timeout` | Another transaction or session is blocking the schema change. | Clear the blocking transaction and retry, or apply the change during a quieter period. | +| `schema change execution failed` after the preflight passed | The test succeeded, but the production change failed. | Check the full task output and agent logs. If the cause is not clear, contact Releem support with the task id and logs. 
| -| Scenario | Error message | Troubleshooting steps | -| ---------------------------- | ----------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `pt-online-schema-change` dry-run failed | `pt-online-schema-change execution failed: pt-online-schema-change dry-run failed: ...` | Install [Percona Toolkit](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html), set `ptosc_path`, and grant the agent database user the required permissions for the target table. Depending on your MySQL version and topology, pt-online-schema-change may require privileges such as `SELECT`, `INSERT`, `DROP`, `RELOAD`, `SUPER`, `SHOW VIEW`, `TRIGGER`. | -| `pt-online-schema-change` execute failed | `pt-online-schema-change execution failed: pt-online-schema-change failed: ...` | Dry-run passed but the execute step failed. Check pt-online-schema-change output in logs (triggers, replicas, disk, permissions, etc) and contact Releem support. | +## pt-online-schema-change Failed +| Error in Releem | What it means | What to do | +| --- | --- | --- | +| `pt-online-schema-change dry-run failed` | The tool was available, but its dry run failed. | Install or update Percona Toolkit, set `ptosc_path`, and check the permissions required by `pt-online-schema-change`. | +| `pt-online-schema-change failed` | The dry run passed, but the actual execution failed. | Check the agent logs for the tool output. 
Common causes include missing privileges, trigger conflicts, replication lag, disk space limits, or table changes made after the recommendation was generated. | -## No statements executed +`pt-online-schema-change` may require permissions such as `SELECT`, `INSERT`, `DROP`, `RELOAD`, `SUPER`, `SHOW VIEW`, and `TRIGGER`, depending on the server version and topology. +## The Task Finished Without Applying a Change -| Scenario | Troubleshooting steps | -| -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| No schema changes executed | Task output includes `No schema changes were executed.` This is returned when the loop finishes without applying any statement (unusual if earlier validation passed). Review full task output and agent logs; retry from Releem or contact support with the task id. | +| Error in Releem | What it means | What to do | +| --- | --- | --- | +| `No schema changes were executed` | The task completed without applying any statement. | Review the full task output and agent logs. Retry from Releem, or contact support if the task should have applied a change. 
| diff --git a/docs/releem-agent/configuration-settings.md b/docs/releem-agent/configuration-settings.md index c533704..b37c7ea 100644 --- a/docs/releem-agent/configuration-settings.md +++ b/docs/releem-agent/configuration-settings.md @@ -94,6 +94,30 @@ query_optimization=false # List of databases for query optimization (comma-separated) databases_query_optimization="" +# Enable automatic execution of approved schema changes +enable_exec_ddl=false + +# Directory for schema-change backups +backup_dir="/tmp/backups" + +# Path to pt-online-schema-change binary +ptosc_path="pt-online-schema-change" + +# Path to mysqldump binary +mysqldump_path="mysqldump" + +# Path to xtrabackup or mariabackup binary +xtrabackup_path="xtrabackup" + +# Extra free-space buffer percentage for backups +backup_space_buffer=20.0 + +# Scratch schema used for Online DDL preflight checks +online_ddl_test_schema="releem_online_ddl_test" + +# Disable disk space checks for schema-change execution +disable_space_checks=false + # Server data storage region - EU or empty releem_region="" ``` @@ -107,6 +131,8 @@ releem_region="" - PostgreSQL monitoring is enabled when `pg_user` and `pg_password` are configured. - Set `query_optimization=true` to enable SQL query optimization features where supported. - Use `databases_query_optimization` to specify which databases to monitor for query optimization (leave empty for all databases). +- Set `enable_exec_ddl=true` only when the Releem Agent is allowed to apply approved schema changes automatically. +- Keep `disable_space_checks=false` for production use unless you have a separate capacity check in place. - The `releem_region` field can be set to `EU` for European data storage or left empty for default storage. 
## Restarting the Agent From aa973e87d5288335b3d700d9f3961516be5fbdce Mon Sep 17 00:00:00 2001 From: Dmitry Kochetov Date: Fri, 12 Jun 2026 17:36:45 +0400 Subject: [PATCH 08/10] Document MySQL setup for automatic schema changes --- .../automatic-schema-changes.md | 131 +++++++++++++++++- 1 file changed, 125 insertions(+), 6 deletions(-) diff --git a/docs/query-optimization/automatic-schema-changes.md b/docs/query-optimization/automatic-schema-changes.md index e1340d2..d45e647 100644 --- a/docs/query-optimization/automatic-schema-changes.md +++ b/docs/query-optimization/automatic-schema-changes.md @@ -24,9 +24,10 @@ Automatic schema changes require: - an installed and running Releem Agent; - [SQL Query Optimization](/query-optimization/enable-sql-query-optimization) enabled for the server; +- MySQL or MariaDB configuration that allows Releem to validate the schema change and create a backup when needed; - a MySQL user used by the agent with permissions to apply the approved schema changes; - enough free disk space in the MySQL data directory and in the backup directory; -- point-in-time recovery when Releem requires a pre-change backup; +- binary logs with at least 2 days of retention when Releem requires a pre-change backup; - additional tools installed on the same host as the agent when they are needed. For Linux servers, the default agent configuration file is: @@ -92,7 +93,122 @@ sudo yum install MariaDB-backup Package names can differ by operating system and repository. Use the official installation instructions linked at the end of this page for production servers. -### 2. Configure the Releem Agent +### 2. Configure MySQL for Automatic Schema Changes + +Releem uses MySQL metadata collected by the agent to decide whether a schema change can run automatically. Configure MySQL before enabling DDL execution in the agent. 
+ +#### Enable point-in-time recovery for pre-change backups + +When Releem marks a schema change as requiring a pre-change backup, the change runs only if point-in-time recovery is available. Releem considers point-in-time recovery available when: + +- `log_bin` is `ON`; +- binary log retention is at least 2 days. + +If these values are not available, the Releem Agent skips the schema change instead of applying it without the required recovery option. + +Add the settings to the MySQL or MariaDB server configuration file. Common locations are: + +- `/etc/mysql/mysql.conf.d/mysqld.cnf` for MySQL on Debian or Ubuntu; +- `/etc/mysql/mariadb.conf.d/50-server.cnf` for MariaDB on Debian or Ubuntu; +- `/etc/my.cnf` or a file under `/etc/my.cnf.d/` for RHEL-based distributions. + +For MySQL 8.0 and newer, set binary log retention in seconds: + +```ini +[mysqld] +log_bin=mysql-bin +binlog_expire_logs_seconds=172800 +binlog_format=ROW +``` + +For MySQL 5.7 and MariaDB 10.5 or earlier, use `expire_logs_days`: + +```ini +[mysqld] +log_bin=mysql-bin +expire_logs_days=2 +binlog_format=ROW +``` + +For MariaDB 10.6 and newer, either retention variable can be used. Releem reads `binlog_expire_logs_seconds` when it is available: + +```ini +[mariadb] +log_bin +binlog_expire_logs_seconds=172800 +binlog_format=ROW +``` + +If binary logging is already enabled, keep your existing binary log basename and only adjust retention if it is lower than 2 days. After changing MySQL configuration, restart MySQL or MariaDB. + +Use the command that matches your service name: + +```bash +sudo systemctl restart mysql +sudo systemctl restart mysqld +sudo systemctl restart mariadb +``` + +For managed database services, configure the equivalent database parameter or backup/binlog retention setting in the provider console. Releem must see `log_bin = ON` and retention of at least 2 days in the variables collected by the agent. 
+ +Verify the effective values: + +```sql +SHOW VARIABLES +WHERE Variable_name IN ( + 'log_bin', + 'binlog_expire_logs_seconds', + 'expire_logs_days', + 'binlog_format', + 'datadir' +); +``` + +After changing MySQL settings, let the Releem Agent collect the next snapshot or run: + +```bash +sudo /opt/releem/releem-agent -f +``` + +#### Check table and execution requirements + +Releem checks each approved statement before sending it to the agent: + +- native Online DDL is allowed for InnoDB tables on MySQL 5.7+ or MariaDB 10+ when the statement is valid; +- `pt-online-schema-change` is allowed only for InnoDB tables with a primary key, without triggers, and without referencing foreign keys; +- online physical backup is selected only for InnoDB tables; +- if neither native Online DDL nor `pt-online-schema-change` is safe, Releem does not run the change automatically. + +You can check the target table before approving a change: + +```sql +SELECT ENGINE +FROM information_schema.TABLES +WHERE TABLE_SCHEMA = 'your_database' + AND TABLE_NAME = 'your_table'; + +SHOW INDEX FROM `your_database`.`your_table` WHERE Key_name = 'PRIMARY'; + +SHOW TRIGGERS FROM `your_database` LIKE 'your_table'; + +SELECT TABLE_SCHEMA, TABLE_NAME, CONSTRAINT_NAME +FROM information_schema.KEY_COLUMN_USAGE +WHERE REFERENCED_TABLE_SCHEMA = 'your_database' + AND REFERENCED_TABLE_NAME = 'your_table'; +``` + +#### Check disk-space requirements + +The agent checks disk capacity before execution: + +- the MySQL data directory must keep more than 10% free space; +- projected MySQL data directory usage after the schema change must stay at or below 90%; +- when a backup is required, `backup_dir` must have enough free space for the estimated backup size plus `backup_space_buffer`; +- logical backups use `mysqldump`; physical backups use `xtrabackup` or `mariabackup`. + +Keep `disable_space_checks = false` for production use. 
Disable it only temporarily and only when you already have another capacity check in place. + +### 3. Configure the Releem Agent Open the agent configuration file: @@ -128,7 +244,7 @@ For MariaDB Backup, set: xtrabackup_path = "mariabackup" ``` -### 3. Prepare the Backup Directory +### 4. Prepare the Backup Directory Create the backup directory and make sure the Releem Agent process can write to it: @@ -137,7 +253,7 @@ sudo mkdir -p /tmp/backups ``` -### 4. Grant Database Permissions +### 5. Grant Database Permissions Connect to MySQL or MariaDB as an administrator and grant the required permissions to the same database user that the Releem Agent already uses. @@ -170,7 +286,7 @@ On MySQL 8 and newer, use the equivalent dynamic privileges required by your sec Replace `releem` and `127.0.0.1` with the user and host from your agent configuration if they are different. -### 5. Restart the Releem Agent +### 6. Restart the Releem Agent Restart the agent so it reads the new configuration: @@ -184,7 +300,7 @@ If your server uses the legacy service command: sudo service releem-agent restart ``` -### 6. Approve the Schema Change in Releem +### 7. Approve the Schema Change in Releem Open the Releem Dashboard, review the query recommendation, and approve the change only when you are ready for the agent to apply it. @@ -231,4 +347,7 @@ Use the returned paths in `releem.conf` if needed. 
- [MariaDB Backup documentation](https://mariadb.com/docs/server/server-usage/backup-and-restore/mariadb-backup/mariadb-backup-overview) - [MySQL APT Repository guide](https://dev.mysql.com/doc/mysql-apt-repo-quick-guide/en/) - [MySQL Yum Repository guide](https://dev.mysql.com/doc/mysql-yum-repo-quick-guide/en/) +- [MySQL binary logging options](https://dev.mysql.com/doc/refman/8.4/en/replication-options-binary-log.html) +- [MariaDB binary log activation](https://mariadb.com/docs/server/server-management/server-monitoring-logs/binary-log/activating-the-binary-log) +- [MariaDB binary log variables](https://mariadb.com/docs/server/ha-and-performance/standard-replication/replication-and-binary-log-system-variables) - [mysqldump documentation](https://dev.mysql.com/doc/refman/8.4/en/mysqldump.html) From ef3bdca5d2ce15dbc616332539689c60a6a2e027 Mon Sep 17 00:00:00 2001 From: Dmitry Kochetov Date: Fri, 12 Jun 2026 18:08:00 +0400 Subject: [PATCH 09/10] Document schema change troubleshooting by error text --- .../schema-change-troubleshooting.md | 124 +++++++++++------- 1 file changed, 79 insertions(+), 45 deletions(-) diff --git a/docs/query-optimization/schema-change-troubleshooting.md b/docs/query-optimization/schema-change-troubleshooting.md index 03bc9ec..e0914c5 100644 --- a/docs/query-optimization/schema-change-troubleshooting.md +++ b/docs/query-optimization/schema-change-troubleshooting.md @@ -5,77 +5,111 @@ title: Schema Change Troubleshooting # Schema Change Troubleshooting -Use this guide when an automatic schema change fails in Releem. The failed task usually shows a short message in the Releem Dashboard, and the Releem Agent logs can include more detail. +Use this guide when an automatic schema change fails in Releem. -Before you retry the task: +Open the failed task in the Releem Dashboard and check: -1. Open the failed task in the Releem Dashboard. -2. Copy the exact error message. -3. Check the matching section below. -4. Fix the server-side issue first. 
-5. Retry the task from Releem. +1. the exact **Apply Index Error** message; +2. the detailed error text shown after the main error title; +3. the Releem Agent logs when the dashboard message is not enough. -If the dashboard message is not enough, check the agent logs. See [How to Check Releem Agent Logs](/releem-agent/how-to-check-logs). +Some dashboard views show a generic title such as `Schema changes execution failed` and then append the detailed error sent by the agent. Match the detailed text against the tables below. For logs, see [How to Check Releem Agent Logs](/releem-agent/how-to-check-logs). For setup requirements, see [Automatic Schema Changes](/query-optimization/automatic-schema-changes). -## The Task Did Not Start +## Errors Before Execution Starts -| Error in Releem | What it means | What to do | +These errors happen before the agent runs a backup or DDL statement. + +| Error name | Error text contains | What to do | | --- | --- | --- | -| `schema change execution is disabled` | The agent is not allowed to run DDL statements. | Set `enable_exec_ddl = true` in `/opt/releem/releem.conf`, restart the Releem Agent, and retry the task. | -| `syntax validation failed` | The SQL statement is not valid for the target server. | Do not retry the same task. Fix or recreate the schema recommendation in Releem. | -| `invalid task payload` or missing fields such as `schema_name` or `ddl_statement` | The task data sent to the agent is incomplete. | Contact Releem support with the task id. This cannot be fixed only on the server. | -| `empty schema change list` | The task did not include any statements to run. | Retry from Releem. If the task should contain a change, contact Releem support with the task id. | +| Automatic schema changes are disabled | `schema change execution is disabled by config` | Set `enable_exec_ddl = true` in `/opt/releem/releem.conf`, restart the Releem Agent, and retry the task from Releem. 
| +| Task data is invalid | `taskdetails JSON`, `schema_name is required`, `ddl_statement is required`, `analysis_results.schema_name is required`, or `analysis_results.table_name is required` | Contact Releem support with the task id. The task payload sent to the agent is incomplete or malformed and cannot be fixed only on the database server. | +| No schema change was sent | `Invalid task_details: empty schema change list` | Retry from Releem. If the recommendation should contain a change, contact Releem support with the task id. | +| SQL statement is invalid | `Statement N skipped: syntax validation failed` | Do not retry the same task. Fix or recreate the schema recommendation so the DDL is valid for the target MySQL or MariaDB version. | + +## Releem Did Not Find a Safe Automatic Method -## Releem Could Not Choose a Safe Method +These errors happen during per-statement validation. The agent stops before running DDL. -| Error in Releem | What it means | What to do | +| Error name | Error text contains | What to do | | --- | --- | --- | -| `no safe execution method` | Releem could not apply the change without blocking the affected table. | Run the change manually during a maintenance window, or contact Releem support to review the recommendation. | -| `pre-change backup required but PITR unavailable` | Releem requires a backup before the change, but point-in-time recovery is not available. | Enable binary logs and keep enough retention for recovery, then retry. For MySQL, check `log_bin` and binary log expiration settings. | +| Releem cannot run this change online | `Statement N skipped: cannot be executed without blocking the table` | Apply the change manually during a maintenance window, or ask Releem support to review the recommendation. Releem did not allow native Online DDL or `pt-online-schema-change` for this statement. 
| +| Point-in-time recovery is not ready | `Point-in-time recovery is not possible` | Enable binary logging and keep at least 2 days of binary log retention, then let the agent collect a fresh snapshot and retry. If this is a managed database, enable the provider's equivalent PITR/binlog retention setting. | + +## Backup or Execution Failed -## Disk Space or Filesystem Checks Failed +These errors mean the agent started the operational phase and then the backup or DDL execution failed. The dashboard may show `Schema changes execution failed` with the detailed agent error. -| Error in Releem | What it means | What to do | +Use the text after `Statement N failed:` or after the dashboard prefix to identify the exact issue. + +### Disk Space and Filesystem Checks + +| Error name | Error text contains | What to do | | --- | --- | --- | -| `insufficient datadir free space` or `projected usage ... exceeds 90% limit` | The MySQL data directory does not have enough free space for the schema change. | Free disk space, move data, archive old tables, or retry during a planned maintenance process after adding capacity. | -| `backup failed: insufficient disk space` | The backup directory does not have enough free space. | Free space on the filesystem used by `backup_dir`, move `backup_dir` to a larger volume, or increase available storage. | -| `failed to resolve datadir`, `datadir is empty`, `failed to get table size`, or `invalid datadir filesystem size` | The agent could not inspect the data directory or estimate the affected table size. | Check that the agent can connect to MySQL, read the target table metadata, and access the filesystem information. | -| `failed to check disk space` or `failed to create backup directory` | The agent cannot read or create the configured backup directory. | Create `backup_dir`, fix ownership and permissions, and make sure the filesystem is writable by the agent process. 
| +| MySQL data directory has too little free space | `insufficient datadir free space` | Free space on the filesystem that contains MySQL `datadir`, then retry. The agent requires more than 10% free space before it starts the change. | +| MySQL data directory would become too full | `insufficient datadir capacity for table change` | Add storage, archive data, drop unused data, or reduce the target table size before retrying. Projected datadir usage after the change must stay at or below 90%. | +| Agent cannot read the MySQL data directory | `failed to resolve datadir`, `datadir is empty`, `failed to check datadir filesystem capacity`, or `invalid datadir filesystem size` | Check that MySQL returns `SHOW VARIABLES LIKE 'datadir'`, that the path exists on the host where the agent runs, and that the agent can read filesystem capacity for that path. | +| Agent cannot estimate the target table size | `failed to get table size` | Check that the table still exists and that the agent MySQL user can read `information_schema.TABLES` for the target schema and table. | + +Do not disable space checks for normal production use. Use `disable_space_checks = true` only temporarily and only when another capacity check is already in place. -Do not disable space checks for normal production use. `disable_space_checks = true` should be used only temporarily and only when you have another capacity check in place. +### Backup Directory Checks -## Backup Failed +| Error name | Error text contains | What to do | +| --- | --- | --- | +| Backup directory does not have enough free space | `backup failed: insufficient disk space` | Free space on the filesystem used by `backup_dir`, or move `backup_dir` to a larger volume. Keep `backup_space_buffer` high enough for your safety margin. | +| Backup directory cannot be checked | `backup failed: failed to check disk space` | Create `backup_dir` before retrying, confirm it is on a mounted filesystem, and make sure the agent process can access it. 
| +| Backup directory cannot be created | `backup failed: failed to create backup directory` | Create the directory manually, fix ownership and permissions, and retry. | +| Backup size cannot be estimated | `backup failed: failed to estimate backup size` | Check that the target table or database still exists and that the agent MySQL user can read metadata from `information_schema.TABLES`. | + +### Pre-change Backup Tools -| Error in Releem | What it means | What to do | +| Error name | Error text contains | What to do | | --- | --- | --- | -| `mysqldump failed` | The logical backup failed before the schema change. | Install `mysqldump`, set `mysqldump_path` if needed, and confirm that the agent database user can read the target table. | -| `xtrabackup backup failed` or `xtrabackup prepare failed` | The physical backup failed before the schema change. | Install a compatible `xtrabackup` version for MySQL or use `mariabackup` for MariaDB. Set `xtrabackup_path` to the correct binary. | -| `failed to estimate backup size` | The agent could not estimate the table backup size. | Check that the database and table still exist and that the agent database user can read their metadata. | +| Backup connection settings are incomplete | `backup failed: mysql_host is required for backup` or `backup failed: config is required for backup` | Check the MySQL connection settings in `releem.conf`: `mysql_host`, `mysql_port`, `mysql_user`, and `mysql_password`. Restart the agent after changing the file. | +| Logical backup with mysqldump failed | `backup failed: mysqldump failed` | Install `mysqldump`, set `mysqldump_path` if the binary is not on `PATH`, and confirm that the agent MySQL user can dump the target table. | +| Physical backup with xtrabackup failed | `backup failed: xtrabackup backup failed` | Install a compatible Percona XtraBackup for MySQL, or use `mariabackup` for MariaDB by setting `xtrabackup_path = "mariabackup"`. Check the tool output in the agent logs. 
| +| Physical backup prepare step failed | `backup failed: xtrabackup prepare failed` | Check that the backup tool version matches the server type and version. Review the detailed tool output in the agent logs, fix the backup tool issue, and retry. | +| Unsupported backup method was selected | `backup failed: unsupported backup method` | Contact Releem support with the task id and agent logs. This indicates an internal task or agent mismatch. | -After fixing the tool or permission issue, restart the agent if you changed `releem.conf`. +### Native Online DDL -## Online DDL Failed +These errors are usually shown as `Schema changes execution failed: schema change execution failed: ...`. -| Error in Releem | What it means | What to do | +| Error name | Error text contains | What to do | | --- | --- | --- | -| `online DDL preflight failed on test table` | The agent tested the change in `online_ddl_test_schema`, and the test failed. | Check the SQL error in the task output. Confirm that the test schema exists and that the agent user has `CREATE`, `DROP`, `INDEX`, and `ALTER` permissions on it. | -| `test schema is required`, `failed to create test schema`, or `failed to create test table` | The agent could not create the temporary schema or table used for the preflight. | Create the schema from [Automatic Schema Changes](/query-optimization/automatic-schema-changes) and grant the required permissions. | -| `lock wait timeout`, `metadata lock`, or `failed to set session lock_wait_timeout` | Another transaction or session is blocking the schema change. | Clear the blocking transaction and retry, or apply the change during a quieter period. | -| `schema change execution failed` after the preflight passed | The test succeeded, but the production change failed. | Check the full task output and agent logs. If the cause is not clear, contact Releem support with the task id and logs. 
| +| Online DDL test schema is not configured | `test schema is required for online DDL preflight` | Set `online_ddl_test_schema = "releem_online_ddl_test"` or another schema name in `releem.conf`, grant the agent user access to it, restart the agent, and retry. | +| Online DDL test schema cannot be created | `failed to create test schema` | Grant the agent MySQL user permission to create the configured test schema, or create the schema manually and grant access. | +| Online DDL test table cannot be created | `failed to create test table` | Grant the agent user permission to create tables in `online_ddl_test_schema`. Also check that the source table still exists. | +| Online DDL preflight failed | `online DDL preflight failed on test table` | The agent tested the DDL on an empty copy of the table and MySQL rejected it. Check the MySQL error after this message. Fix unsupported DDL, incompatible clauses, or missing privileges before retrying. | +| DDL format is not supported for online execution | `empty SQL statement`, `unsupported DDL for online clauses`, `failed to prepare test DDL SQL`, or `could not locate target table in DDL statement` | Use a supported `ALTER TABLE` or `CREATE INDEX` statement that clearly targets the analyzed table. Recreate the recommendation if the DDL text no longer matches the table. | +| Online DDL lock timeout cannot be set | `failed to set session lock_wait_timeout` | Grant the needed session-variable permission if required by your server, or check MySQL restrictions that prevent setting session variables. | +| Online DDL is blocked or rejected on the production table | `schema change execution failed` with a MySQL error about locks, metadata locks, `ALGORITHM=INPLACE`, `LOCK=NONE`, or another server error | Clear blocking transactions, retry during a quieter period, or apply the change manually during a maintenance window. 
If the error says `Try ALGORITHM=COPY`, Releem can fall back to `pt-online-schema-change` only when that method was allowed for the statement. | + +### pt-online-schema-change -## pt-online-schema-change Failed +These errors are usually shown as `Schema changes execution failed: pt-online-schema-change execution failed: ...`. -| Error in Releem | What it means | What to do | +| Error name | Error text contains | What to do | | --- | --- | --- | -| `pt-online-schema-change dry-run failed` | The tool was available, but its dry run failed. | Install or update Percona Toolkit, set `ptosc_path`, and check the permissions required by `pt-online-schema-change`. | -| `pt-online-schema-change failed` | The dry run passed, but the actual execution failed. | Check the agent logs for the tool output. Common causes include missing privileges, trigger conflicts, replication lag, disk space limits, or table changes made after the recommendation was generated. | +| pt-online-schema-change connection settings are incomplete | `mysql_host is required for pt-online-schema-change` or `config is required for pt-online-schema-change` | Check `mysql_host`, `mysql_port`, `mysql_user`, and `mysql_password` in `releem.conf`. Restart the agent after changing the file. | +| pt-online-schema-change cannot parse the table name | `pt-online-schema-change execution failed: failed to parse table name` | Confirm that `analysis_results.schema_name` and `analysis_results.table_name` match an existing table and use a valid `schema.table` form. | +| pt-online-schema-change dry run failed | `pt-online-schema-change dry-run failed` | Install or update Percona Toolkit, set `ptosc_path` if needed, and grant the permissions required by `pt-online-schema-change`. Run a manual dry run with the same connection settings if you need the full tool output. | +| pt-online-schema-change execution failed | `pt-online-schema-change failed` | The dry run passed, but the actual execution failed. 
Check the tool output in agent logs. Common causes include missing privileges, replica lag, triggers, foreign key restrictions, disk limits, or table changes after the recommendation was generated. | -`pt-online-schema-change` may require permissions such as `SELECT`, `INSERT`, `DROP`, `RELOAD`, `SUPER`, `SHOW VIEW`, and `TRIGGER`, depending on the server version and topology. +`pt-online-schema-change` may require privileges such as `SELECT`, `INSERT`, `DROP`, `RELOAD`, `SUPER`, `SHOW VIEW`, and `TRIGGER`, depending on the server version and topology. + +### Other Execution Errors + +| Error name | Error text contains | What to do | +| --- | --- | --- | +| Target table is missing from the task | `table name is required for schema change execution` | Contact Releem support with the task id. The task reached the agent without the analyzed table name. | +| Target table name cannot be parsed | `failed to parse table name` | Check that the recommendation still points to an existing `schema.table`. If the table was renamed or dropped after the recommendation was generated, recreate the recommendation. | +| No execution method was available during execution | `schema change could not be executed` | Contact Releem support with the task id and agent logs. This should normally be caught earlier by validation. | -## The Task Finished Without Applying a Change +## No Statement Was Applied -| Error in Releem | What it means | What to do | +| Error name | Error text contains | What to do | | --- | --- | --- | -| `No schema changes were executed` | The task completed without applying any statement. | Review the full task output and agent logs. Retry from Releem, or contact support if the task should have applied a change. | +| Task finished without applying a change | `No schema changes were executed` | Review the full task output and agent logs. Retry from Releem if the recommendation is still valid, or contact support if the task should have applied a change. 
| \ No newline at end of file From 9be8b6752f66fa14d6e43f934ce27fbfb46ac252 Mon Sep 17 00:00:00 2001 From: Dmitry Kochetov Date: Fri, 12 Jun 2026 20:11:29 +0400 Subject: [PATCH 10/10] fixed documentations --- .../automatic-schema-changes.md | 72 +++++++++---------- .../schema-change-troubleshooting.md | 12 ++-- 2 files changed, 41 insertions(+), 43 deletions(-) diff --git a/docs/query-optimization/automatic-schema-changes.md b/docs/query-optimization/automatic-schema-changes.md index d45e647..d840587 100644 --- a/docs/query-optimization/automatic-schema-changes.md +++ b/docs/query-optimization/automatic-schema-changes.md @@ -44,21 +44,20 @@ Install the tools that match your database type and backup strategy. Some packages may not be available in the default operating system repositories. If your package manager cannot find a package, add the vendor repository first and then run the installation command again: -- For Percona Toolkit and Percona XtraBackup, follow [Install percona-release](https://docs.percona.com/percona-software-repositories/installing.html). +- For `pt-online-schema-change` and Percona XtraBackup, follow [Install percona-release](https://docs.percona.com/percona-software-repositories/installing.html). - For MariaDB Backup, follow [MariaDB Package Repository Setup and Usage](https://mariadb.com/docs/server/server-management/install-and-upgrade-mariadb/installing-mariadb/binary-packages/mariadb-package-repository-setup-and-usage). -- For MySQL client packages, follow the MySQL repository guide for your platform: [MySQL APT Repository](https://dev.mysql.com/doc/mysql-apt-repo-quick-guide/en/) or [MySQL Yum Repository](https://dev.mysql.com/doc/mysql-yum-repo-quick-guide/en/). 
For Debian or Ubuntu: ```bash sudo apt-get update -sudo apt-get install default-mysql-client percona-toolkit +sudo apt-get install pt-online-schema-change ``` For RHEL, CentOS, Rocky Linux, or AlmaLinux: ```bash -sudo yum install mysql percona-toolkit +sudo yum install pt-online-schema-change ``` On newer releases such as Rocky Linux 8+ or AlmaLinux 8+, use `dnf` instead of `yum`. @@ -149,7 +148,7 @@ sudo systemctl restart mysqld sudo systemctl restart mariadb ``` -For managed database services, configure the equivalent database parameter or backup/binlog retention setting in the provider console. Releem must see `log_bin = ON` and retention of at least 2 days in the variables collected by the agent. +For managed database services, configure the equivalent database parameter or backup/binlog retention setting in the provider console. Releem must see `log_bin=ON` and retention of at least 2 days in the variables collected by the agent. Verify the effective values: @@ -164,12 +163,6 @@ WHERE Variable_name IN ( ); ``` -After changing MySQL settings, let the Releem Agent collect the next snapshot or run: - -```bash -sudo /opt/releem/releem-agent -f -``` - #### Check table and execution requirements Releem checks each approved statement before sending it to the agent: @@ -184,17 +177,17 @@ You can check the target table before approving a change: ```sql SELECT ENGINE FROM information_schema.TABLES -WHERE TABLE_SCHEMA = 'your_database' - AND TABLE_NAME = 'your_table'; +WHERE TABLE_SCHEMA='your_database' + AND TABLE_NAME='your_table'; -SHOW INDEX FROM `your_database`.`your_table` WHERE Key_name = 'PRIMARY'; +SHOW INDEX FROM `your_database`.`your_table` WHERE Key_name='PRIMARY'; SHOW TRIGGERS FROM `your_database` LIKE 'your_table'; SELECT TABLE_SCHEMA, TABLE_NAME, CONSTRAINT_NAME FROM information_schema.KEY_COLUMN_USAGE -WHERE REFERENCED_TABLE_SCHEMA = 'your_database' - AND REFERENCED_TABLE_NAME = 'your_table'; +WHERE REFERENCED_TABLE_SCHEMA='your_database' + AND 
REFERENCED_TABLE_NAME='your_table'; ``` #### Check disk-space requirements @@ -206,7 +199,7 @@ The agent checks disk capacity before execution: - when a backup is required, `backup_dir` must have enough free space for the estimated backup size plus `backup_space_buffer`; - logical backups use `mysqldump`; physical backups use `xtrabackup` or `mariabackup`. -Keep `disable_space_checks = false` for production use. Disable it only temporarily and only when you already have another capacity check in place. +Keep `disable_space_checks=false` for production use. Disable it only temporarily and only when you already have another capacity check in place. ### 3. Configure the Releem Agent @@ -219,29 +212,29 @@ sudo nano /opt/releem/releem.conf Enable DDL execution and review the paths used by the agent: ``` -enable_exec_ddl = true - -backup_dir = "/tmp/backups" -ptosc_path = "pt-online-schema-change" -mysqldump_path = "mysqldump" -xtrabackup_path = "xtrabackup" -backup_space_buffer = 20.0 -online_ddl_test_schema = "releem_online_ddl_test" -disable_space_checks = false +enable_exec_ddl=true + +backup_dir="/tmp/backups" +ptosc_path="pt-online-schema-change" +mysqldump_path="mysqldump" +xtrabackup_path="xtrabackup" +backup_space_buffer=20.0 +online_ddl_test_schema="releem_online_ddl_test" +disable_space_checks=false ``` Use full paths when a tool is not available on the agent process `PATH`. For example: ``` -ptosc_path = "/usr/bin/pt-online-schema-change" -mysqldump_path = "/usr/bin/mysqldump" -xtrabackup_path = "/usr/bin/xtrabackup" +ptosc_path="/usr/bin/pt-online-schema-change" +mysqldump_path="/usr/bin/mysqldump" +xtrabackup_path="/usr/bin/xtrabackup" ``` For MariaDB Backup, set: ``` -xtrabackup_path = "mariabackup" +xtrabackup_path="mariabackup" ``` ### 4. Prepare the Backup Directory @@ -300,7 +293,15 @@ If your server uses the legacy service command: sudo service releem-agent restart ``` -### 7. Approve the Schema Change in Releem +### 7. 
Collect the Next Snapshot with the Releem Agent + +After changing MySQL and Releem Agent settings, let the Releem Agent collect the next snapshot or run: + +```bash +sudo /opt/releem/releem-agent -f +``` + +### 8. Approve the Schema Change in Releem Open the Releem Dashboard, review the query recommendation, and approve the change only when you are ready for the agent to apply it. @@ -312,7 +313,7 @@ After the task starts, Releem shows the result in the dashboard. If the task fai | --- | --- | --- | | `enable_exec_ddl` | `false` | Enables automatic execution of approved schema changes. Keep it `false` when you want Releem to recommend changes only. | | `backup_dir` | `/tmp/backups` | Directory where the agent stores logical and physical backups before a schema change. | -| `ptosc_path` | `pt-online-schema-change` | Path to `pt-online-schema-change` from Percona Toolkit. Used when Releem selects that method. | +| `ptosc_path` | `pt-online-schema-change` | Path to the `pt-online-schema-change` binary. Used when Releem selects that method. | | `mysqldump_path` | `mysqldump` | Path to `mysqldump`. Used for logical table backups. | | `xtrabackup_path` | `xtrabackup` | Path to `xtrabackup` or `mariabackup`. Used for physical backups when Releem selects that method. | | `backup_space_buffer` | `20.0` | Extra free-space percentage required above the estimated backup size. | @@ -321,8 +322,8 @@ After the task starts, Releem shows the result in the dashboard. If the task fai ## Additional Tool Installation Notes -- `mysqldump` is usually included in MySQL client packages. -- `pt-online-schema-change` is included in Percona Toolkit. +- `mysqldump` is needed only when Releem selects logical backup. Check that it already exists on the server or set `mysqldump_path` to the installed binary. +- `pt-online-schema-change` is required only when Releem selects this execution method. - `xtrabackup` is installed from Percona XtraBackup packages.
- `mariabackup` is installed from MariaDB Backup packages and should be used for MariaDB servers. @@ -339,14 +340,11 @@ Use the returned paths in `releem.conf` if needed. ## Documentation Links -- [Percona Toolkit installation](https://docs.percona.com/percona-toolkit/installation.html) - [Percona repository setup](https://docs.percona.com/percona-software-repositories/installing.html) - [pt-online-schema-change documentation](https://docs.percona.com/percona-toolkit/pt-online-schema-change.html) - [Percona XtraBackup installation](https://docs.percona.com/percona-xtrabackup/8.0/installation.html) - [MariaDB repository setup](https://mariadb.com/docs/server/server-management/install-and-upgrade-mariadb/installing-mariadb/binary-packages/mariadb-package-repository-setup-and-usage) - [MariaDB Backup documentation](https://mariadb.com/docs/server/server-usage/backup-and-restore/mariadb-backup/mariadb-backup-overview) -- [MySQL APT Repository guide](https://dev.mysql.com/doc/mysql-apt-repo-quick-guide/en/) -- [MySQL Yum Repository guide](https://dev.mysql.com/doc/mysql-yum-repo-quick-guide/en/) - [MySQL binary logging options](https://dev.mysql.com/doc/refman/8.4/en/replication-options-binary-log.html) - [MariaDB binary log activation](https://mariadb.com/docs/server/server-management/server-monitoring-logs/binary-log/activating-the-binary-log) - [MariaDB binary log variables](https://mariadb.com/docs/server/ha-and-performance/standard-replication/replication-and-binary-log-system-variables) diff --git a/docs/query-optimization/schema-change-troubleshooting.md b/docs/query-optimization/schema-change-troubleshooting.md index e0914c5..30bc670 100644 --- a/docs/query-optimization/schema-change-troubleshooting.md +++ b/docs/query-optimization/schema-change-troubleshooting.md @@ -23,7 +23,7 @@ These errors happen before the agent runs a backup or DDL statement. 
| Error name | Error text contains | What to do | | --- | --- | --- | -| Automatic schema changes are disabled | `schema change execution is disabled by config` | Set `enable_exec_ddl = true` in `/opt/releem/releem.conf`, restart the Releem Agent, and retry the task from Releem. | +| Automatic schema changes are disabled | `schema change execution is disabled by config` | Set `enable_exec_ddl=true` in `/opt/releem/releem.conf`, restart the Releem Agent, and retry the task from Releem. | | Task data is invalid | `taskdetails JSON`, `schema_name is required`, `ddl_statement is required`, `analysis_results.schema_name is required`, or `analysis_results.table_name is required` | Contact Releem support with the task id. The task payload sent to the agent is incomplete or malformed and cannot be fixed only on the database server. | | No schema change was sent | `Invalid task_details: empty schema change list` | Retry from Releem. If the recommendation should contain a change, contact Releem support with the task id. | | SQL statement is invalid | `Statement N skipped: syntax validation failed` | Do not retry the same task. Fix or recreate the schema recommendation so the DDL is valid for the target MySQL or MariaDB version. | @@ -52,7 +52,7 @@ Use the text after `Statement N failed:` or after the dashboard prefix to identi | Agent cannot read the MySQL data directory | `failed to resolve datadir`, `datadir is empty`, `failed to check datadir filesystem capacity`, or `invalid datadir filesystem size` | Check that MySQL returns `SHOW VARIABLES LIKE 'datadir'`, that the path exists on the host where the agent runs, and that the agent can read filesystem capacity for that path. | | Agent cannot estimate the target table size | `failed to get table size` | Check that the table still exists and that the agent MySQL user can read `information_schema.TABLES` for the target schema and table. | -Do not disable space checks for normal production use. 
Use `disable_space_checks = true` only temporarily and only when another capacity check is already in place. +Do not disable space checks for normal production use. Use `disable_space_checks=true` only temporarily and only when another capacity check is already in place. ### Backup Directory Checks @@ -69,7 +69,7 @@ Do not disable space checks for normal production use. Use `disable_space_checks | --- | --- | --- | | Backup connection settings are incomplete | `backup failed: mysql_host is required for backup` or `backup failed: config is required for backup` | Check the MySQL connection settings in `releem.conf`: `mysql_host`, `mysql_port`, `mysql_user`, and `mysql_password`. Restart the agent after changing the file. | | Logical backup with mysqldump failed | `backup failed: mysqldump failed` | Install `mysqldump`, set `mysqldump_path` if the binary is not on `PATH`, and confirm that the agent MySQL user can dump the target table. | -| Physical backup with xtrabackup failed | `backup failed: xtrabackup backup failed` | Install a compatible Percona XtraBackup for MySQL, or use `mariabackup` for MariaDB by setting `xtrabackup_path = "mariabackup"`. Check the tool output in the agent logs. | +| Physical backup with xtrabackup failed | `backup failed: xtrabackup backup failed` | Install a compatible Percona XtraBackup for MySQL, or use `mariabackup` for MariaDB by setting `xtrabackup_path="mariabackup"`. Check the tool output in the agent logs. | | Physical backup prepare step failed | `backup failed: xtrabackup prepare failed` | Check that the backup tool version matches the server type and version. Review the detailed tool output in the agent logs, fix the backup tool issue, and retry. | | Unsupported backup method was selected | `backup failed: unsupported backup method` | Contact Releem support with the task id and agent logs. This indicates an internal task or agent mismatch. 
| @@ -79,7 +79,7 @@ These errors are usually shown as `Schema changes execution failed: schema chang | Error name | Error text contains | What to do | | --- | --- | --- | -| Online DDL test schema is not configured | `test schema is required for online DDL preflight` | Set `online_ddl_test_schema = "releem_online_ddl_test"` or another schema name in `releem.conf`, grant the agent user access to it, restart the agent, and retry. | +| Online DDL test schema is not configured | `test schema is required for online DDL preflight` | Set `online_ddl_test_schema="releem_online_ddl_test"` or another schema name in `releem.conf`, grant the agent user access to it, restart the agent, and retry. | | Online DDL test schema cannot be created | `failed to create test schema` | Grant the agent MySQL user permission to create the configured test schema, or create the schema manually and grant access. | | Online DDL test table cannot be created | `failed to create test table` | Grant the agent user permission to create tables in `online_ddl_test_schema`. Also check that the source table still exists. | | Online DDL preflight failed | `online DDL preflight failed on test table` | The agent tested the DDL on an empty copy of the table and MySQL rejected it. Check the MySQL error after this message. Fix unsupported DDL, incompatible clauses, or missing privileges before retrying. | @@ -95,7 +95,7 @@ These errors are usually shown as `Schema changes execution failed: pt-online-sc | --- | --- | --- | | pt-online-schema-change connection settings are incomplete | `mysql_host is required for pt-online-schema-change` or `config is required for pt-online-schema-change` | Check `mysql_host`, `mysql_port`, `mysql_user`, and `mysql_password` in `releem.conf`. Restart the agent after changing the file. 
| | pt-online-schema-change cannot parse the table name | `pt-online-schema-change execution failed: failed to parse table name` | Confirm that `analysis_results.schema_name` and `analysis_results.table_name` match an existing table and use a valid `schema.table` form. | -| pt-online-schema-change dry run failed | `pt-online-schema-change dry-run failed` | Install or update Percona Toolkit, set `ptosc_path` if needed, and grant the permissions required by `pt-online-schema-change`. Run a manual dry run with the same connection settings if you need the full tool output. | +| pt-online-schema-change dry run failed | `pt-online-schema-change dry-run failed` | Install or update `pt-online-schema-change`, set `ptosc_path` if needed, and grant the permissions required by `pt-online-schema-change`. Run a manual dry run with the same connection settings if you need the full tool output. | | pt-online-schema-change execution failed | `pt-online-schema-change failed` | The dry run passed, but the actual execution failed. Check the tool output in agent logs. Common causes include missing privileges, replica lag, triggers, foreign key restrictions, disk limits, or table changes after the recommendation was generated. | `pt-online-schema-change` may require privileges such as `SELECT`, `INSERT`, `DROP`, `RELOAD`, `SUPER`, `SHOW VIEW`, and `TRIGGER`, depending on the server version and topology. @@ -112,4 +112,4 @@ These errors are usually shown as `Schema changes execution failed: pt-online-sc | Error name | Error text contains | What to do | | --- | --- | --- | -| Task finished without applying a change | `No schema changes were executed` | Review the full task output and agent logs. Retry from Releem if the recommendation is still valid, or contact support if the task should have applied a change. | \ No newline at end of file +| Task finished without applying a change | `No schema changes were executed` | Review the full task output and agent logs. 
Retry from Releem if the recommendation is still valid, or contact support if the task should have applied a change. |
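
The troubleshooting tables above match on substrings of the error text, so the lookup can be sketched as a small script when triaging many task outputs. This is only an illustration, not part of the Releem Agent: the `classify_error` helper and the `ERROR_PATTERNS` list are hypothetical names, and the patterns are copied from the "Error text contains" column for the `pt-online-schema-change`, "Other Execution Errors", and "No Statement Was Applied" tables. More specific patterns are listed first because `failed to parse table name` appears in two rows.

```python
from typing import Optional

# Hypothetical triage helper; it is NOT shipped with the Releem Agent.
# Each pair is (substring from "Error text contains", value from "Error name").
# Order matters: the pt-online-schema-change parse error must be checked
# before the generic "failed to parse table name" row.
ERROR_PATTERNS = [
    ("pt-online-schema-change execution failed: failed to parse table name",
     "pt-online-schema-change cannot parse the table name"),
    ("pt-online-schema-change dry-run failed",
     "pt-online-schema-change dry run failed"),
    ("pt-online-schema-change failed",
     "pt-online-schema-change execution failed"),
    ("table name is required for schema change execution",
     "Target table is missing from the task"),
    ("failed to parse table name",
     "Target table name cannot be parsed"),
    ("schema change could not be executed",
     "No execution method was available during execution"),
    ("No schema changes were executed",
     "Task finished without applying a change"),
]


def classify_error(message: str) -> Optional[str]:
    """Return the first matching error name, or None if nothing matches."""
    for pattern, name in ERROR_PATTERNS:
        if pattern in message:
            return name
    return None


if __name__ == "__main__":
    sample = "Schema changes execution failed: No schema changes were executed"
    print(classify_error(sample))
```

A message that matches none of the patterns returns `None`; in that case the other tables in this guide, or the agent logs, are the place to look.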