Preventing out-of-memory errors in Lifecycle Query Engine

Lifecycle Query Engine (LQE) automatically stops rogue queries to prevent OutOfMemoryError exceptions.

When an LQE system is constantly running queries, or a running query doesn’t stop, while indexing continues, the indexing results are blocked and accumulate in the heap. If Write-transactions are held up by queries and don't complete, the system might reach the maximum amount of heap space, which can cause OutOfMemoryError exceptions. As an LQE administrator, you should monitor query and indexing activity for evidence of rogue queries, long-running queries, incessant query and indexing activity, and blocked pending writebacks. As long as writebacks, suspensions, and JVM heap garbage collection jobs complete, the JVM does not throw OutOfMemoryError or StackOverflowError exceptions.

Problem

OutOfMemoryError exceptions are caused by a backlog of transactional journal writebacks from LQE. This problem usually happens when indexing activity continues while a rogue query or an incessant query load keeps the writebacks from completing. Depending on conditions, you might see StackOverflowError exceptions instead.

Apache Jena TDB logs all the changes from indexing activities in a journal. The journal is written to the index at an optimal moment. This process allows Read-transactions to proceed without locking the index. At any time, there can be multiple Read-transactions and one active Write-transaction.

A rogue query is one that never ends, even with timeout mechanisms in place. When rogue queries occur or query requests are incessant, the transaction write journals might get backlogged on the Java™ heap. This transaction backlog can cause OutOfMemoryError or StackOverflowError exceptions because the Write-transactions can't be performed. LQE might also have so many queries and indexing activities that garbage collection of the Java heap is prevented. The heap space gets overwhelmed, and active tasks cannot free heap memory.
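The backlog described above can be illustrated with a toy model. The following sketch is not Apache Jena TDB or LQE code; the class JournalBacklogSketch and all of its methods are hypothetical, and it assumes, as a simplification, that the "optimal moment" for writing the journal to the index is whenever no Read-transaction is active.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model only: NOT Apache Jena TDB code and NOT LQE code. */
class JournalBacklogSketch {
    // Committed changes held on the heap until they can be written to the index.
    private final Deque<String> journal = new ArrayDeque<>();
    private int activeReaders = 0;
    private int flushedToIndex = 0;

    /** A query (Read-transaction) starts. */
    void beginRead() {
        activeReaders++;
    }

    /** A query ends; if it was the last reader, pending writebacks can proceed. */
    void endRead() {
        if (activeReaders == 0) {
            throw new IllegalStateException("no active reader");
        }
        activeReaders--;
        flushIfIdle();
    }

    /** Indexing commits a change; it is journaled first, then flushed when possible. */
    void commitWrite(String change) {
        journal.addLast(change);
        flushIfIdle();
    }

    // Assumption of this sketch: the journal is written back only when no reader is active.
    private void flushIfIdle() {
        if (activeReaders > 0) {
            return;
        }
        flushedToIndex += journal.size();
        journal.clear();
    }

    int backlog() {
        return journal.size();
    }

    int flushed() {
        return flushedToIndex;
    }

    public static void main(String[] args) {
        JournalBacklogSketch model = new JournalBacklogSketch();
        model.beginRead(); // a query that does not end
        for (int i = 0; i < 1000; i++) {
            model.commitWrite("change-" + i); // indexing continues meanwhile
        }
        System.out.println("backlog while query runs: " + model.backlog()); // prints 1000
        model.endRead();
        System.out.println("backlog after query ends: " + model.backlog()); // prints 0
    }
}
```

In this model, a single query that never ends is enough to keep every committed change on the heap, which is why stopping the query lets the writebacks drain without a restart.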

Solution

As the administrator, you can monitor LQE and, if these conditions are present, determine whether query requests can be paused for a short time so that the journal writebacks can be written to the index. If it is not possible to pause query requests, you can modify some properties that can help curtail these JVM heap space issues.
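As a rough illustration of what checking heap pressure involves, the following sketch reads heap usage through the standard java.lang.management API. It is not an LQE tool: it reports on the JVM that it runs in, so observing the LQE server's JVM would require running equivalent logic inside that JVM or connecting to it remotely. The class name HeapPressureSketch and the 85% warning threshold are hypothetical choices for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Illustrative only: reports heap usage of the JVM it runs in, not of a remote LQE server. */
class HeapPressureSketch {
    // Hypothetical threshold; pick a value that suits your environment.
    static final double WARN_FRACTION = 0.85;

    /** Returns used/max as a fraction, or -1.0 when the maximum heap size is undefined. */
    static double usedFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0;
        }
        return (double) usedBytes / maxBytes;
    }

    /** True when heap usage is at or above the warning threshold. */
    static boolean underPressure(long usedBytes, long maxBytes) {
        return usedFraction(usedBytes, maxBytes) >= WARN_FRACTION;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long used = heap.getUsed();
        long max = heap.getMax(); // -1 if the maximum is undefined
        System.out.printf("heap used: %d bytes, max: %d bytes, fraction: %.2f%n",
                used, max, usedFraction(used, max));
        if (underPressure(used, max)) {
            System.out.println("Heap usage is high; check for blocked writebacks or long-running queries.");
        }
    }
}
```

A sustained high fraction that does not drop after garbage collection is the kind of signal that would suggest pausing query requests so that pending writebacks can complete.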

All queries should end normally or by timing out; however, in Apache Jena, some queries might become rogue. Lifecycle Query Engine automatically cancels rogue queries. It is not necessary to restart the LQE server.

For detailed information about how to monitor indexing activity and modify properties, see the following topics:
