80 changes: 56 additions & 24 deletions doc/book/admin/troubleshoot.rst
@@ -1,15 +1,14 @@
.. _admin-troubleshoot:
.. _admin-troubleshooting-guide:

================================================================================
Troubleshooting guide
================================================================================
=====================

.. _admin-troubleshoot-memory-issues:

--------------------------------------------------------------------------------

Problem: INSERT/UPDATE-requests result in ER_MEMORY_ISSUE error
--------------------------------------------------------------------------------
---------------------------------------------------------------

**Possible reasons**

@@ -63,9 +62,8 @@ Try either of the following measures:

.. _admin-troubleshoot-cpu-load:

--------------------------------------------------------------------------------
Problem: Tarantool generates too heavy CPU load
--------------------------------------------------------------------------------
-----------------------------------------------

**Possible reasons**

@@ -108,9 +106,8 @@ If the load is mostly generated by INSERT/UPDATE/DELETE requests, we recommend

.. _admin-troubleshoot-query-timeout:

--------------------------------------------------------------------------------
Problem: Query processing times out
--------------------------------------------------------------------------------
-----------------------------------

**Possible reasons**

@@ -165,9 +162,8 @@ Problem: Query processing times out

.. _admin-troubleshoot-negative-lag-idle:

--------------------------------------------------------------------------------
Problem: Replication "lag" and "idle" contain negative values
--------------------------------------------------------------------------------
-------------------------------------------------------------

This is about ``box.info.replication.(upstream.)lag`` and
``box.info.replication.(upstream.)idle`` values in
@@ -189,9 +185,8 @@ the local instance’s clock.

.. _admin-troubleshoot-idle-grows-no-logs:

--------------------------------------------------------------------------------
Problem: Replication "idle" keeps growing, but no related log messages appear
--------------------------------------------------------------------------------
-----------------------------------------------------------------------------

This is about ``box.info.replication.(upstream.)idle`` value in
:doc:`/reference/reference_lua/box_info/replication` section.
@@ -211,9 +206,8 @@ the same replica UUID'``.

.. _admin-troubleshoot-mr-odd-replication-stats:

--------------------------------------------------------------------------------
Problem: Replication statistics differ on replicas within a replica set
--------------------------------------------------------------------------------
-----------------------------------------------------------------------

This is about a replica set that consists of one master and several replicas.
In a replica set of this type, values in
@@ -231,9 +225,8 @@ Replication is broken.

.. _admin-troubleshoot-mm-replication-stopped:

--------------------------------------------------------------------------------
Problem: Master-master replication is stopped
--------------------------------------------------------------------------------
---------------------------------------------

This is about
:doc:`box.info.replication(.upstream).status </reference/reference_lua/box_info/replication>`
@@ -268,9 +261,8 @@ We also recommend using text primary keys or setting up

.. _admin-troubleshoot-slow-tarantool:

--------------------------------------------------------------------------------
Problem: Tarantool works much slower than before
--------------------------------------------------------------------------------
------------------------------------------------

**Possible reasons**

@@ -308,15 +300,56 @@ recommend to optimize your Tarantool application code).
If the value is greater than 0.01, your application definitely needs thorough
code analysis aimed at optimizing memory usage.

.. _admin-troubleshoot-auth-delay:

Problem: Adding a new replica set to a cluster results in ER_AUTH_DELAY error
-----------------------------------------------------------------------------

There are instances in the cluster that are unable to connect to another node in the replica set due to exceeding
the number of authorization attempts.
On these instances, the ``Too many authentication attempts`` error is raised.

**Possible reasons**

1. Incorrect authentication credentials

**Solution**

In the cluster configuration, verify that the credentials the node is attempting to connect with are correct.
To do this, check the :ref:`replication <cfg_replication-replication>` parameter.
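
As a rough aid for this check, the sketch below lists the upstream connections that are not in the ``follow`` state, together with the peer address and the last error message.
It assumes it is run in the console of the instance that reports the error; ``failing_upstreams`` is a hypothetical helper name, not part of Tarantool.
Note that an upstream in a transient state such as ``sync`` is also listed.

.. code-block:: lua

    -- Sketch: collect upstreams whose status is not 'follow'.
    -- `failing_upstreams` is a hypothetical helper, not a Tarantool API.
    local function failing_upstreams()
        local result = {}
        for id, replica in pairs(box.info.replication) do
            local upstream = replica.upstream
            -- The local instance has no upstream entry for itself.
            if upstream ~= nil and upstream.status ~= 'follow' then
                table.insert(result, {
                    id = id,
                    peer = upstream.peer,
                    status = upstream.status,
                    message = upstream.message,
                })
            end
        end
        return result
    end

If a listed peer points to the node in question, compare the user name in that URI with the one set in the ``replication`` parameter.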

2. Network issues

**Solution**

If you encounter network issues, restart the instance or re-add the replica set to the cluster.

3. Tarantool instances are running on matching addresses

**Solution**

Identify the instance that other nodes in the replica set are unable to connect to.
Check the number of failed authorization attempts on the instance that the other nodes could not connect to.

.. code-block:: lua

    box.stat().AUTH

If the number of failed attempts is increasing every second, check the list of nodes that are trying to authorize on this replica.
An increasing number of attempts may indicate there are some other Tarantool instances on the machine that have been
previously started on the same addresses.
In this case, the instance with the ``ER_AUTH_DELAY`` error and some old Tarantool nodes are both trying to
authorize on the same replica, and the first instance exceeds the authorization time limit on the replica.
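
One way to see whether the counter is growing is to sample it twice and compare the values.
The sketch below assumes that ``box.stat().AUTH`` exposes a ``total`` field, as the other ``box.stat()`` counters do, and that it is run in the console of the replica that receives the connection attempts.
``auth_requests_delta`` is a hypothetical helper name, not part of Tarantool.

.. code-block:: lua

    -- Sketch: count AUTH requests received during `interval` seconds.
    -- `auth_requests_delta` is a hypothetical helper, not a Tarantool API.
    local function auth_requests_delta(interval)
        local fiber = require('fiber')
        local before = box.stat().AUTH.total
        fiber.sleep(interval)  -- yield while new requests arrive
        return box.stat().AUTH.total - before
    end

For example, call ``auth_requests_delta(5)`` several times in a row.
A result that stays above zero suggests that some node keeps trying to authorize on this replica.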

To resolve the problem, stop the old Tarantool instances and restart the replication.

.. _admin-troubleshoot-finalizer_yielding:

--------------------------------------------------------------------------------
Problem: Fiber switch is forbidden in '__gc' metamethod
--------------------------------------------------------------------------------
-------------------------------------------------------

~~~~~~~~~~~~~~~~~~~~~~~~
Problem description
~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~

Fiber switch is forbidden in ``__gc`` metamethod since `this change <https://github.com/tarantool/tarantool/issues/4518#issuecomment-704259323>`_
to avoid unexpected Lua OOM.
Expand All @@ -325,9 +358,8 @@ for example, to close a socket.

Below are examples of how to implement such a procedure properly.

~~~~~~~~~~~~~~~~
Solution
~~~~~~~~~~~~~~~~
~~~~~~~~

First, there come two simple examples illustrating the logic of the
solution:
81 changes: 81 additions & 0 deletions locale/ru/LC_MESSAGES/book/admin/troubleshoot.po
@@ -448,6 +448,87 @@ msgstr ""
"Если значение больше 0,01, код приложения однозначно необходимо "
"проанализировать на предмет оптимизации использования памяти."

msgid ""
"Problem: Adding a new replica set to a cluster results in "
"ER_AUTH_DELAY error"
msgstr ""
"Проблема: при добавлении нового набора реплик в кластер возникает ошибка "
"ER_AUTH_DELAY"

msgid ""
"There are instances in the cluster that are unable to connect to another node "
"in the replica set due to exceeding the number of authorization attempts. "
"On these instances, the ``Too many authentication attempts`` error is raised."
msgstr ""
"В кластере есть экземпляры, которые не могут подключиться к другому узлу "
"в наборе реплик из-за превышения числа попыток авторизации. "
"На таких узлах возникает ошибка ``Too many authentication attempts``."

msgid "**Possible reasons**"
msgstr "**Возможные причины**"

msgid ""
"Incorrect authentication credentials"
msgstr ""
"Некорректные данные аутентификации"

msgid "Solution"
msgstr "Решение"

msgid ""
"In the cluster configuration, verify that the credentials the node is attempting to connect with are correct. "
"To do this, check the :ref:`replication <cfg_replication-replication>` parameter."
msgstr ""
"Убедитесь, что в конфигурации кластера корректно указаны учетные данные, которые узел использует для подключения. "
"Для этого проверьте параметр :ref:`replication <cfg_replication-replication>`."

msgid ""
"Network issues"
msgstr ""
"Проблемы с сетью"

msgid "Solution"
msgstr "Решение"

msgid ""
"If you encounter network issues, restart the instance or re-add the replica set to the cluster."
msgstr ""
"При возникновении проблем с сетью перезапустите экземпляр или повторно добавьте набор реплик в кластер."

msgid ""
"Tarantool instances are running on matching addresses"
msgstr ""
"Экземпляры Tarantool запущены на совпадающих адресах"

msgid "Solution"
msgstr "Решение"

msgid ""
"Identify the instance that other nodes in the replica set are unable to connect to. "
"Check the number of failed authorization attempts on the instance that the other nodes could not connect to."
msgstr ""
"Определите экземпляр, к которому не могут подключиться другие узлы в наборе реплик. "
"Проверьте, сколько было неудачных попыток авторизации на узле, к которому не удалось подключиться."

msgid ""
"If the number of failed attempts is increasing every second, check the list of nodes "
"that are trying to authorize on this replica. "
"An increasing number of attempts may indicate there are some other Tarantool instances on the machine "
"that have been previously started on the same addresses. "
"In this case, the instance with the ``ER_AUTH_DELAY`` error and some old Tarantool nodes "
"are both trying to authorize on the same replica, and the first instance exceeds "
"the authorization time limit on the replica. "
"To resolve the problem, stop the old Tarantool instances and restart the replication."
msgstr ""
"Если количество неудачных попыток растет с каждой секундой, проверьте список узлов, "
"которые пытаются авторизоваться на этой реплике. "
"Растущее число попыток может указывать на наличие на машине других экземпляров Tarantool, "
"которые ранее были запущены на тех же адресах. "
"В этом случае узел с ошибкой ``ER_AUTH_DELAY`` и некоторые старые узлы Tarantool "
"пытаются авторизоваться на одной и той же реплике, и узел с ошибкой превышает "
"лимит времени авторизации на реплике. "
"Чтобы решить возникшую проблему, остановите старые экземпляры Tarantool и перезапустите репликацию."

msgid "Problem: Fiber switch is forbidden in '__gc' metamethod"
msgstr "Проблема: Переключение файберов запрещено в метаметоде ``__gc``"
