This repository was archived by the owner on Mar 8, 2024. It is now read-only.
Request Type
Bug
Problem Description
The index engine fails to process a document if it contains a "non full-text" field larger than 32766 bytes.
During document creation, the document is not indexed and becomes invisible (even though it is stored in the database).
During a data reindex, the process stops and part of the data is left unindexed.
A single oversized field can therefore break the application.
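The 32766-byte limit applies to the UTF-8 encoding of the term, not to its character count, which is why the thresholds below are expressed in characters with a worst case of 4 bytes each. A minimal Python sketch of the check (the constant and function name are illustrative, not part of any real API):

```python
# Lucene rejects any single indexed term whose UTF-8 encoding exceeds 32766 bytes.
LUCENE_MAX_TERM_BYTES = 32766

def exceeds_index_limit(value: str) -> bool:
    """Return True if indexing this field value would raise an 'immense term' error."""
    return len(value.encode("utf-8")) > LUCENE_MAX_TERM_BYTES

# 8192 four-byte characters (e.g. emoji) already exceed the limit,
# even though the character count looks small.
print(exceeds_index_limit("\U0001F600" * 8192))  # → True (8192 × 4 = 32768 bytes)
```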
Solution
During database initialisation, add a process that finds immense terms and fixes them. Several strategies can be applied:
truncate: truncate the data
delete: remove the document
log: show the document in logs
A custom strategy (for example, storing the data in a file storage) could also be considered, but it cannot be implemented in Scalligraph.
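The three strategies above can be sketched as follows. This is a hypothetical illustration in Python; the function name, signature, and dict-based document are assumptions, not Scalligraph's actual Scala API:

```python
import logging

def apply_strategy(doc: dict, field: str, strategy: str, threshold: int = 8191):
    """Apply an immense-term repair strategy to a document (returns None to delete it)."""
    value = doc.get(field)
    if value is None or len(value) <= threshold:
        return doc                      # nothing to fix
    if strategy == "truncate":
        doc[field] = value[:threshold]  # keep only the first `threshold` characters
        return doc
    if strategy == "delete":
        return None                     # drop the whole document
    if strategy == "log":
        logging.warning("immense term in field %r: %d chars", field, len(value))
        return doc                      # leave the document untouched
    raise ValueError(f"unknown strategy: {strategy}")
```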
The process requires a full scan of the database (because the index cannot be used). It is triggered only if the configuration is present. The configuration consists of a field name and the strategy to apply to it. The strategy can take an optional parameter that defines the size threshold in characters (a character may occupy up to 4 bytes in UTF-8).
db.janusgraph {
  immenseTermProcessing: {
    data: "delete(2048)"    // Delete documents whose field "data" is larger than 2048 characters
    title: "truncate(4096)" // Truncate the field "title" to 4096 characters
    name: "truncate"        // Truncate the field "name" (default threshold is 8191)
  }
}
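A strategy spec such as "delete(2048)" or a bare "truncate" could be parsed as sketched below. This parser is hypothetical; only the strategy names, the spec syntax, and the default threshold of 8191 characters (8191 × 4 = 32764 bytes, just under the 32766-byte limit) come from the configuration above:

```python
import re

def parse_strategy(spec: str, default_threshold: int = 8191):
    """Parse a spec like 'delete(2048)' into a (name, threshold) pair."""
    m = re.fullmatch(r"(truncate|delete|log)(?:\((\d+)\))?", spec.strip())
    if not m:
        raise ValueError(f"invalid strategy spec: {spec}")
    name, threshold = m.group(1), m.group(2)
    return name, int(threshold) if threshold else default_threshold

print(parse_strategy("delete(2048)"))  # → ('delete', 2048)
print(parse_strategy("truncate"))      # → ('truncate', 8191)
```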
IMPORTANT The configuration should be present for only one startup, to fix the data. It should be removed as soon as the process is finished.