Elasticsearch: search_context_missing_exception - No search context found for id

Consolidating daily Logstash indexes into monthly Logstash indexes (by reindexing) sometimes fails with the following error (excerpt from the response):

        "index": "logstash-2018-11-01",
        "shard": 0,
        "node": "aodScdJuQ5OWLyucQ6Px5Q",
        "reason": {
          "type": "search_context_missing_exception",
          "reason": "No search context found for id [3708723]"

This error is typically caused by the following:

Reindexing uses the scroll API under the covers to read a “point-in-time” view of the source data.

This point in time consists of a set of segments (Lucene files) that are essentially locked and prevented from being deleted by the usual segment merging process that works in the background to reorganise the index in response to ongoing CUD (create/update/delete) operations.

It is costly to preserve this view of data, which is why users of the scroll API must renew their lock with each new request for a page of results. Locks will time out if the client fails to return within the timespan in which it said it would return.

The error you are seeing is because the reindex function has requested another page of results but the scroll ID, which represents a lock on a set of files, is either:

* timed out (i.e. the reindex client spent too long indexing the previous page) or
* lost because the node serving the scroll API was restarted or otherwise became unavailable

(Source: https://discuss.elastic.co/t/problem-when-reindexing-large-index/117421)
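If the first cause applies (the scroll timing out between pages), one thing to try is to give each page more time and less work. The sketch below only builds the request; it does not send it. It assumes the `scroll` query parameter and the `source.size` batch size of the `_reindex` API behave as documented for your Elasticsearch version. The destination index name `logstash-2018-11`, the `30m` keep-alive and the batch size of 500 are illustrative guesses, not values from my setup.

```python
import json
from urllib.parse import urlencode


def build_reindex_request(source_index, dest_index, scroll="30m", batch_size=500):
    """Return (path, body) for a _reindex call with a longer scroll keep-alive.

    scroll      -- how long each scroll context is kept alive between pages
    batch_size  -- documents fetched and indexed per page (source.size)
    Both defaults here are illustrative assumptions, not recommendations.
    """
    # wait_for_completion=false runs the reindex as a background task
    params = urlencode({"scroll": scroll, "wait_for_completion": "false"})
    body = {
        "source": {"index": source_index, "size": batch_size},
        "dest": {"index": dest_index},
    }
    return f"/_reindex?{params}", json.dumps(body)


# Hypothetical monthly target index for the daily index from the error above
path, body = build_reindex_request("logstash-2018-11-01", "logstash-2018-11")
print("POST", path)
print(body)
```

The printed path and body can be sent with curl or any HTTP client. Note that this would not help with the second cause: if the node holding the scroll context is restarted or drops out, the context is lost regardless of the keep-alive.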

Looking at the Elasticsearch logs I can see:

[INFO ][o.e.t.LoggingTaskListener] 3406201 finished with response BulkByScrollResponse[took=2.1h,timed_out=false,sliceId=null,updated=0,created=37036000,deleted=0,batches=37036,versionConflicts=0,noops=0,retries=0,throttledUntil=0s,bulk_failures=[],search_failures=[{"shard":-1,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [1102015]"}}, {"shard":-1,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [1102016]"}}, {"shard":-1,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [1102023]"}}]]
[INFO ][o.e.m.j.JvmGcMonitorService] [server-ls2.local] [gc][40994] overhead, spent [335ms] collecting in the last [1s]

[WARN ][o.e.t.TransportService   ] [server-ls2.local] Received response for a request that has timed out, sent [51770ms] ago, timed out [21769ms] ago, action [internal:discovery/zen/fd/master_ping], node [{server-ls5.local}{aM_wxa2mTY2XI9P7bsobSg}{gsNCWS-GQ0md1u-yuhxSNw}{}{}{site_id=rack1, ml.machine_memory=67279155200, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], id [11840408]

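So there was a timeout issue (the master ping response arrived about 21 seconds after the request had already timed out), but the underlying cause is unknown at this time.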
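As a rough sanity check, the numbers in the BulkByScrollResponse log line above can be turned into an average time per batch. The snippet below is a minimal sketch: the `LOG_LINE` string is an abridged copy of that log line containing only the three fields used, and the `batch_stats` helper is a name made up for this example.

```python
import re

# Abridged copy of the BulkByScrollResponse log line above (only the fields used)
LOG_LINE = "BulkByScrollResponse[took=2.1h,timed_out=false,created=37036000,batches=37036]"


def batch_stats(line):
    """Return (documents per batch, average seconds per batch) from the log line."""
    took_hours = float(re.search(r"took=([\d.]+)h", line).group(1))
    created = int(re.search(r"created=(\d+)", line).group(1))
    batches = int(re.search(r"batches=(\d+)", line).group(1))
    return created / batches, took_hours * 3600 / batches


docs_per_batch, secs_per_batch = batch_stats(LOG_LINE)
print(f"{docs_per_batch:.0f} docs per batch, {secs_per_batch:.2f}s per batch on average")
```

On average a batch of 1000 documents took roughly a fifth of a second, which is far below any plausible scroll keep-alive (I believe the reindex default is 5 minutes, but check this for your version). That suggests the scroll contexts were not lost because indexing was slow in general, but because of an occasional long stall or a node becoming unresponsive, which would be consistent with the GC overhead and master ping timeout in the logs.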