How To Check (And Fix) Indexes For Elasticsearch/Logstash

Posted by Joe Julian 3 years, 6 months ago

I had a computer with some bad RAM that corrupted an index in Elasticsearch. After spending all weekend and half of Monday trying to get Elasticsearch running again, with no response on the IRC channel, I eventually figured it out with the help of some obscure email threads that pointed to further references.

Elasticsearch was failing to start without logging any errors, critical or otherwise, only some warnings:

[2013-08-23 21:31:32,527][WARN ][index.shard.service      ] [Iron Fist] [logstash-2013.08.24][3] Failed to perform scheduled engine refresh
[2013-08-23 21:43:59,264][WARN ][index.merge.scheduler    ] [Iron Fist] [logstash-2013.08.24][3] failed to merge
[2013-08-23 21:43:59,267][WARN ][index.engine.robin       ] [Iron Fist] [logstash-2013.08.24][3] failed engine
[2013-08-23 21:43:59,340][WARN ][cluster.action.shard     ] [Iron Fist] sending failed shard for [logstash-2013.08.24][3], node[TFt4zNl4QjWO9dyDhwnwkA], [P], s[STARTED], reason [engine failure, message [MergeException[java.lang.RuntimeException: Invalid vInt detected (too many bits)]; nested: RuntimeException[Invalid vInt detected (too many bits)]; ]]
[2013-08-23 21:43:59,340][WARN ][cluster.action.shard     ] [Iron Fist] received shard failed for [logstash-2013.08.24][3], node[TFt4zNl4QjWO9dyDhwnwkA], [P], s[STARTED], reason [engine failure, message [MergeException[java.lang.RuntimeException: Invalid vInt detected (too many bits)]; nested: RuntimeException[Invalid vInt detected (too many bits)]; ]]
[2013-08-24 04:47:10,230][WARN ][index.merge.scheduler    ] [Iron Fist] [logstash-2013.08.24][2] failed to merge
[2013-08-24 04:47:10,236][WARN ][index.engine.robin       ] [Iron Fist] [logstash-2013.08.24][2] failed engine
[2013-08-24 04:47:10,302][WARN ][cluster.action.shard     ] [Iron Fist] sending failed shard for [logstash-2013.08.24][2], node[TFt4zNl4QjWO9dyDhwnwkA], [P], s[STARTED], reason [engine failure, message [MergeException[java.lang.RuntimeException: Invalid vLong detected (negative values disallowed)]; nested: RuntimeException[Invalid vLong detected (negative values disallowed)]; ]]
[2013-08-24 04:47:10,302][WARN ][cluster.action.shard     ] [Iron Fist] received shard failed for [logstash-2013.08.24][2], node[TFt4zNl4QjWO9dyDhwnwkA], [P], s[STARTED], reason [engine failure, message [MergeException[java.lang.RuntimeException: Invalid vLong detected (negative values disallowed)]; nested: RuntimeException[Invalid vLong detected (negative values disallowed)]; ]]

The problem was actually in shard 0's index, which none of the warnings mentioned. The solution was to check the indexes with Lucene's CheckIndex tool. The commands below assume the install locations used by the RPM from http://www.elasticsearch.org/download/:

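# Paths as installed by the Elasticsearch 0.90.3 RPM
ES_HOME=/usr/share/elasticsearch
ES_CLASSPATH="$ES_CLASSPATH:$ES_HOME/lib/elasticsearch-0.90.3.jar:$ES_HOME/lib/*:$ES_HOME/lib/sigar/*"
# Lucene index directory of the shard to check (shard 0 here)
INDEXPATH=/data/logstash/data/elasticsearch/nodes/0/indices/logstash-2013.08.24/0/index/
# Run CheckIndex as the logstash user, with Lucene assertions enabled
sudo -u logstash java -cp "$ES_CLASSPATH" -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex "$INDEXPATH"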
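
Because the logs never named the shard that was actually broken, it can help to run the check against every shard of the index instead of guessing. The following is only a sketch: the function names list_shard_dirs and check_all_shards are made up for illustration, and it assumes the <index>/<shard>/index directory layout seen in INDEXPATH above and that ES_CLASSPATH has already been set as shown.

```shell
#!/bin/sh
# Hypothetical helper: print the Lucene index directory of every shard
# found under one Elasticsearch index directory.
# Assumed layout (taken from INDEXPATH above): <index dir>/<shard>/index
list_shard_dirs() {
    for dir in "$1"/*/index; do
        # Skip anything that is not a directory (including an unmatched glob).
        [ -d "$dir" ] && printf '%s\n' "$dir"
    done
    return 0
}

# Hypothetical helper: run CheckIndex in read-only mode (no -fix)
# against each shard in turn, printing a header before each report.
check_all_shards() {
    list_shard_dirs "$1" | while IFS= read -r shard; do
        echo "== $shard"
        sudo -u logstash java -cp "$ES_CLASSPATH" -ea:org.apache.lucene... \
            org.apache.lucene.index.CheckIndex "$shard"
    done
}
```

With those defined, something like check_all_shards /data/logstash/data/elasticsearch/nodes/0/indices/logstash-2013.08.24 would walk all of that index's shards. Nothing is modified, since "-fix" is not passed.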

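Once the problem index is identified, CheckIndex reports what it would do to repair it, including how many documents would be lost. If that is acceptable, run the command again with the "-fix" switch. Be aware that "-fix" permanently removes the segments it cannot read, along with the documents in them.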
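
Since "-fix" is destructive, CheckIndex's own usage text advises making a backup copy of the index first. Here is a rough sketch of how the repair step might look with that precaution added. The helpers backup_shard and repair_shard, and the backup destination, are hypothetical and not part of Elasticsearch or Lucene. It assumes ES_CLASSPATH is set as above, that Elasticsearch is not running, and that the current user can read the shard directory.

```shell
#!/bin/sh
# Hypothetical helper: copy a shard's index directory to a new location,
# refusing to overwrite an existing backup.
backup_shard() {
    src=${1%/}    # drop a trailing slash so cp copies the directory itself
    dest=$2
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    cp -Rp "$src" "$dest"
}

# Hypothetical helper: back up the shard, then rerun CheckIndex with -fix.
# The repair is skipped if the backup fails.
repair_shard() {
    backup_shard "$1" "$2" || return 1
    sudo -u logstash java -cp "$ES_CLASSPATH" -ea:org.apache.lucene... \
        org.apache.lucene.index.CheckIndex "$1" -fix
}
```

For example, repair_shard "$INDEXPATH" /root/shard0-index.bak would copy the shard aside and then repair it. The backup path is just a placeholder.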