
Migration fails from 4.0.2-1 to 4.1 #2216

Open
marpoe opened this issue Oct 14, 2021 · 1 comment
Labels: bug, core, TheHive4

marpoe commented Oct 14, 2021

Request Type

Bug

Work Environment

OS version (server): RedHat
OS version (client): 10
TheHive version: 4.1.10
Package Type: RPM
Database: Cassandra
Index type: Lucene
Attachments storage: Local
Browser type & version: Edge

Problem Description

The reindexation step fails when upgrading from TheHive 4.0.2-1.
In my opinion it is similar to the closed issue #1861; the only difference is that our problem is related to the size of the "data" field (see the log excerpt below). We use this field to store SIEM data within our "SIEM <> TheHive" integration; for more details on our workflow and use case, see the explanation below.

TheHive application.log

2021-10-11 12:05:57,917 [ERROR] from org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScannerExecutor in Thread-66 - Unexpected error processing data: {}
java.lang.IllegalArgumentException: Document contains at least one immense term in field="data" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[104, 116, 116, 112, 58, 47, 47, 122, 99, 114, 109, 115, 116, 97, 116, 105, 99, 45, 97, 46, 97, 107, 97, 109, 97, 105, 104, 100, 46, 110]...', original message: bytes can be at most 32766 in length; got 77530
at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:853)
at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:430)
at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:394)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:251)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:494)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1616)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1608)
at org.janusgraph.diskstorage.lucene.LuceneIndex.restore(LuceneIndex.java:305)
at org.janusgraph.diskstorage.indexing.IndexTransaction.restore(IndexTransaction.java:128)
at org.janusgraph.graphdb.olap.job.IndexRepairJob.workerIterationEnd(IndexRepairJob.java:201)
at org.janusgraph.graphdb.olap.VertexJobConverter.workerIterationEnd(VertexJobConverter.java:118)
at org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScannerExecutor$Processor.run(StandardScannerExecutor.java:285)
Caused by: org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes can be at most 32766 in length; got 77530
at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:265)
at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:151)
at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:843)
... 11 common frames omitted
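For reference, the term prefix in the error message is a list of UTF-8 byte values; decoding it shows which observable value triggered the failure. A quick Python check:

```python
# Decode the term prefix that Lucene reports as a list of UTF-8 byte values.
prefix = [104, 116, 116, 112, 58, 47, 47, 122, 99, 114, 109, 115, 116, 97,
          116, 105, 99, 45, 97, 46, 97, 107, 97, 109, 97, 105, 104, 100,
          46, 110]
print(bytes(prefix).decode("utf-8"))  # -> http://zcrmstatic-a.akamaihd.n
```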

Some points about our workflow

Within our SIEM <> TheHive integration, we create TheHive alerts from triggered SIEM alerts and map the SIEM fields to observables, e.g. the raw event field from the SIEM. This raw field contains the complete event data and lets the analyst start further research without going back to the SIEM. In some rare cases this field carries a large amount of data, and as a consequence so does the observable. Another reason is that we want everything documented on the alert/case.
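As a stopgap on our side, we are considering truncating the raw event before attaching it as an observable. A minimal sketch, assuming a thehive4py-based integration (the endpoint, API key, and fetch_raw_event_from_siem helper are hypothetical; 32766 is the per-term limit from the log above):

```python
from thehive4py.api import TheHiveApi
from thehive4py.models import Alert, AlertArtifact

LUCENE_MAX_TERM_BYTES = 32766  # per-term limit reported in the log above

def truncate_to_bytes(value: str, limit: int = LUCENE_MAX_TERM_BYTES) -> str:
    """Truncate a string so its UTF-8 encoding stays within `limit` bytes."""
    encoded = value.encode("utf-8")
    if len(encoded) <= limit:
        return value
    # Drop any partial multi-byte sequence left at the cut point.
    return encoded[:limit].decode("utf-8", errors="ignore")

api = TheHiveApi("http://thehive.example:9000", "API_KEY")  # hypothetical endpoint/key

raw_event = fetch_raw_event_from_siem()  # hypothetical SIEM lookup
alert = Alert(
    title="SIEM alert",
    type="external",
    source="siem",
    sourceRef="alert-1234",
    description="Alert raised by the SIEM integration",
    artifacts=[AlertArtifact(dataType="other", data=truncate_to_bytes(raw_event))],
)
api.create_alert(alert)
```

This only limits new data, of course; it does not help with the observables that are already in the database.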

Possible Solutions

Adjust the indexing process for the "data" field.

Complementary information

We have a TheHive instance with around 8,500 alerts and 3,100 cases. With our current version of TheHive we are facing more and more performance issues, so the update to 4.1 is a must for us.

If there is no way to adjust the indexing process, we will have to start TheHive 4.1 on a fresh database, change our workflow, and keep a legacy system for accessing our old data. I would be very grateful for any help.
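To scope the problem before retrying the migration, a script along these lines could list the observables whose data field Lucene would reject (again a sketch assuming thehive4py; I am not certain an empty find_observables query is the most efficient way to walk all observables):

```python
from thehive4py.api import TheHiveApi

LUCENE_MAX_TERM_BYTES = 32766  # per-term limit from the log above

api = TheHiveApi("http://thehive.example:9000", "API_KEY")  # hypothetical endpoint/key

# Walk all observables and report those whose data exceeds Lucene's limit.
response = api.find_observables(query={}, range="all")
for obs in response.json():
    data = obs.get("data") or ""
    size = len(data.encode("utf-8"))
    if size > LUCENE_MAX_TERM_BYTES:
        print(f"{obs['id']}: data is {size} bytes")
```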

marpoe added the bug and TheHive4 labels on Oct 14, 2021
nadouani (Contributor) commented

Hello @marpoe, we will take a look and check for the best possible solution to this issue.
