@@ -35,7 +35,7 @@ Set **Enable Elasticsearch for search queries** to ``true``, and setting **Enabl

.. warning::

  For high post volume deployments, we also strongly recommend *disabling* Database Search once Elasticsearch or AWS OpenSearch is fully configured and running. The Mattermost Server falls back on database search if Elasticsearch or OpenSearch is unavailable, which can lead to performance degradation on high post volume deployments. From Mattermost v11.7, the server proactively detects outages through periodic health checks and falls back to database search on the first health check failure, rather than waiting for requests to time out. See the :ref:`outage handling FAQ <administration-guide/scale/enterprise-search:how does mattermost handle elasticsearch or opensearch outages?>` for details.

Once the configuration is saved, new posts made to the database are automatically indexed on the Elasticsearch or AWS OpenSearch server.

43 changes: 43 additions & 0 deletions source/administration-guide/scale/enterprise-search.rst
@@ -129,6 +129,49 @@ From Mattermost v11, :doc:`Support Packet generation </administration-guide/mana

The enterprise search connection test results appear in the Support Packet and can help identify configuration issues such as network connectivity problems, authentication failures, or server availability issues. If connection errors are present, they will be clearly documented with specific error messages to aid in troubleshooting.

How does Mattermost handle Elasticsearch or OpenSearch outages?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From Mattermost v11.7, the server includes an automatic health monitor for Elasticsearch and OpenSearch connections. The health monitor runs periodic health checks and automatically manages the connection lifecycle:

- **Health checks**: The server checks the health of the search engine cluster every 60 seconds. After 3 consecutive health check failures, the server stops the engine and search continues to use the database.
- **Fast-fail on first failure**: On the very first health check failure, the engine is immediately marked as unhealthy and search requests are routed to the database. This happens before the consecutive failure threshold is reached, so users experience minimal disruption.
- **Automatic retry**: When the search engine is unavailable, the server retries connecting with exponential backoff, starting at 15 seconds and doubling up to a maximum of 5 minutes between attempts.
- **Automatic recovery**: When the search engine becomes available again, the server automatically reconnects and resumes using it for search queries. No manual intervention or server restart is required.
- **Configuration changes**: Changes to Elasticsearch or OpenSearch configuration settings, or license changes, immediately trigger the health monitor to re-evaluate the connection state.
- **Monitoring**: A ``mattermost_search_engine_status`` Prometheus metric reports the health of the search engine (``1`` = healthy or not configured, ``0`` = configured but unavailable). Use this metric to :doc:`set up alerts </administration-guide/scale/performance-alerting>` for search engine outages. See :doc:`performance monitoring metrics </administration-guide/scale/performance-monitoring-metrics>` for details.
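
The timing rules in the list above can be illustrated with a short Python sketch. It is not the Mattermost implementation: the function and constant names are hypothetical, and only the numbers (15-second initial delay, 5-minute cap, threshold of 3 failures) come from the text. The model also stops at the point where the engine is stopped and does not cover the reconnection that follows.

```python
from itertools import islice

# Values taken from the text above; the names are illustrative assumptions.
INITIAL_BACKOFF_S = 15      # first retry delay
MAX_BACKOFF_S = 5 * 60      # retry delay is capped at 5 minutes
FAILURE_THRESHOLD = 3       # consecutive failures before the engine is stopped


def backoff_delays(initial=INITIAL_BACKOFF_S, maximum=MAX_BACKOFF_S):
    """Yield retry delays in seconds, doubling each time up to the cap."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, maximum)


def classify_checks(results, threshold=FAILURE_THRESHOLD):
    """Map health check results (True = passed) to a state per check.

    "healthy":   the engine serves search queries.
    "unhealthy": search uses the database (starts at the first failure).
    "stopped":   the threshold was reached; later checks are not modelled.
    """
    states = []
    failures = 0
    for passed in results:
        if passed:
            failures = 0
            states.append("healthy")
            continue
        failures += 1
        if failures >= threshold:
            states.append("stopped")
            break
        states.append("unhealthy")
    return states


print(list(islice(backoff_delays(), 7)))
# [15, 30, 60, 120, 240, 300, 300]
print(classify_checks([True, False, True, False, False, False]))
# ['healthy', 'unhealthy', 'healthy', 'unhealthy', 'unhealthy', 'stopped']
```

Under these assumptions, the retry delay reaches the 5-minute cap on the sixth attempt, and a single passed check resets the consecutive failure count.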

During an outage, you may see the following log messages:

.. list-table::
   :widths: 15 45 40
   :header-rows: 1

   * - Level
     - Log message
     - Meaning
   * - Error
     - ``Search engine health check failed repeatedly; stopping engine``
     - The failure threshold was reached and the engine has been stopped. Search falls back to the database.
   * - Warn
     - ``Search engine health check failed``
     - An individual health check failed. Includes a ``consecutive_failures`` count.
   * - Warn
     - ``Search engine health check failed: it is now marked as unhealthy``
     - A previously healthy engine failed a health check and has been marked unhealthy. Search requests will fall back to the database immediately.
   * - Warn
     - ``Search engine watcher: Start() failed, will retry``
     - A reconnection attempt failed. Includes a ``next_backoff`` field indicating the time until the next retry.
   * - Info
     - ``Search engine health check succeeded: it is now marked as healthy``
     - The engine passed a health check after being unhealthy and is now handling search requests again.
   * - Info
     - ``Search engine watcher: engine started successfully``
     - The engine has recovered and is active again.
   * - Info
     - ``Search engine watcher: engine disabled, parking``
     - The health monitor is idle because the search engine is disabled in configuration.
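
When scanning logs for these messages, note that ``Search engine health check failed`` is a prefix of two other messages in the table, so a plain substring search for it also matches them. The Python sketch below picks the longest matching message instead. The helper name and the sample log line are hypothetical, and the JSON field names in the sample are an assumption about the log format, not something stated above.

```python
# Message strings copied from the table above.
KNOWN_MESSAGES = [
    "Search engine health check failed repeatedly; stopping engine",
    "Search engine health check failed",
    "Search engine health check failed: it is now marked as unhealthy",
    "Search engine watcher: Start() failed, will retry",
    "Search engine health check succeeded: it is now marked as healthy",
    "Search engine watcher: engine started successfully",
    "Search engine watcher: engine disabled, parking",
]


def match_message(log_line):
    """Return the most specific known message found in log_line, or None."""
    matches = [msg for msg in KNOWN_MESSAGES if msg in log_line]
    if not matches:
        return None
    # The longest match is the most specific one, because the shorter
    # "health check failed" message is a prefix of two longer messages.
    return max(matches, key=len)


# Hypothetical log line; the field names are assumed for illustration.
sample = (
    '{"level":"warn","msg":"Search engine health check failed: '
    'it is now marked as unhealthy"}'
)
print(match_message(sample))
# Search engine health check failed: it is now marked as unhealthy
```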

My search indexes won't complete, what should I do?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -118,6 +118,7 @@ Search metrics
- ``mattermost_search_post_index_total``: The total number of post indexes carried out.
- ``mattermost_search_posts_searches_total``: The total number of post searches carried out.
- ``mattermost_search_user_index_total``: The total number of user indexes carried out.
- ``mattermost_search_engine_status``: Status of the configured search engine: ``1`` = healthy or not configured, ``0`` = configured but unavailable. Use this metric to set up alerts for search engine outages.
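
As a minimal sketch of how the ``mattermost_search_engine_status`` values could be read, the Python below extracts the metric from metrics text in the Prometheus text exposition format. The helper name and the sample text are hypothetical, and treating the metric as a gauge in the sample is an assumption; only the metric name and the meaning of ``1`` and ``0`` come from the description above.

```python
METRIC = "mattermost_search_engine_status"


def search_engine_status(metrics_text):
    """Return the metric value as a float, or None if it is not present."""
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line.startswith(METRIC):
            continue  # also skips blank lines and "# HELP" / "# TYPE" comments
        remainder = line[len(METRIC):]
        if remainder.startswith("{"):
            # Drop a label set; assumes label values contain no "}".
            remainder = remainder.split("}", 1)[1]
        elif not remainder.startswith(" "):
            continue  # a different metric that shares the same prefix
        return float(remainder.split()[0])
    return None


# Hypothetical sample of scraped metrics text.
sample = """\
# TYPE mattermost_search_engine_status gauge
mattermost_search_engine_status 0
"""

status = search_engine_status(sample)
if status == 0:
    print("search engine configured but unavailable")
elif status == 1:
    print("search engine healthy or not configured")
else:
    print("metric not found")
# prints: search engine configured but unavailable
```

Because ``1`` covers both "healthy" and "not configured", a check like this can only detect outages of a configured engine.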

WebSocket metrics
~~~~~~~~~~~~~~~~~