
LQE 7.2 Indexing Performance Report

Authors: SameerkumarKalaria
Build basis: 7.2.0

*Status: DRAFT – This page is under development.*

Introduction

This article provides a deep dive into the performance and behavior of indexing and validation in Lifecycle Query Engine (LQE). Backed by tests on a large repository (82 million resources), the document explores how LQE handles data synchronization from Tracked Resource Set (TRS) feeds, indexing stages, database comparisons, and when to use validation.

Standard Disclaimer

The information in this document is distributed AS IS. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk. Any pointers in this publication to external Web sites are provided for convenience only and do not in any manner serve as an endorsement of these Web sites. Any performance data contained in this document was determined in a controlled environment, and therefore, the results that may be obtained in other operating environments may vary significantly. Users of this document should verify the applicable data for their specific environment.

Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multi-programming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.

This testing was done as a way to compare and characterize the differences in performance between different versions of the product. The results shown here should thus be looked at as a comparison of the contrasting performance between different versions, and not as an absolute benchmark of performance.



What is Indexing?

The Lifecycle Query Engine (LQE) relies on Tracked Resource Set (TRS) feeds, which are contributed by various lifecycle applications (such as RM, CCM, and QM), to build and maintain an index of link-accessible resources. This index is central to enabling cross-application reporting, traceability, and impact analysis across Engineering Lifecycle Management (ELM) tools.

To ensure data remains up to date, LQE periodically synchronizes with each TRS feed. This synchronization ensures near-real-time accuracy for reporting tools such as Jazz Reporting Service (JRS).


TRS Feeds and Synchronization

  • LQE uses Tracked Resource Set (TRS) feeds contributed by lifecycle applications to build its index.
  • The TRS feed represents a live stream of artifact changes from the source application.
  • LQE periodically synchronizes with each TRS feed to keep the index current and near-real-time.
  • This index enables reporting tools like Jazz Reporting Service (JRS) to fetch up-to-date data across the lifecycle.


Indexing (Reindex) Steps

To perform a full indexing (or reindexing) for a specific data provider:

  1. Open the LQE Data Providers page: https://host:port/lqe/web/admin/data-sources
  2. Locate the TRS feed (data provider) you want to index.
  3. Click on the name of the data provider to open its details.
  4. Click the Reindex button to start a full index refresh.
  5. (Optional) Schedule indexing for a specific date and time.
  6. Wait for the reindexing process to complete. Duration depends on data size and backend performance.

Important Notes:

  • Reindexing fully clears the existing index for the provider and rebuilds it from scratch using the TRS feed.
  • This operation can be resource-intensive and may take several hours or days, especially for large datasets.
  • During indexing, if a resource is particularly large, the fetch operation for that item may take longer than usual. In such cases, the UI may appear stalled or stuck at the same count, but indexing is still progressing in the background. The status will update once the large resource is fully processed.
  • To monitor ongoing progress more accurately, you can check the DNG internal metrics page: https://<dng-host>/rm/trs2utils?show=metrics — this provides live statistics on TRS GET operations during indexing.
  • Avoid shutting down or restarting LQE during reindexing — it may result in incomplete or corrupted indexing.
  • If reindexing fails due to an interruption (such as server restart), you must re-run the full reindex to restore index consistency.






What Happens During Reindexing

  • Reindexing is the process of clearing the current index and rebuilding it completely from the TRS feed.
  • This process:
    • Deletes all existing data related to the provider from the LQE index
    • Re-fetches all resources from the TRS feed
    • Rewrites all data to the index from scratch
  • Reindexing is resource-intensive and time-consuming.
    • Depending on repository size, it can take several hours to multiple days.
    • It places a heavy load on both LQE and the data provider application.
  • Reindexing is generally required in the following cases:
    • Initial setup of a new TRS feed
    • Recovery from feed inconsistency or corruption
    • When validation cannot resolve data issues

Indexing is divided into three major stages:

  • Deletion: Removes all previous data related to the TRS feed from the LQE rs tables.
    Example query: DELETE ... WHERE TRS_PROVIDER_ID = ? AND ROWNUM < 1000
  • Fetching: Fetches resources from the TRS feed in parallel threads (recommended: 2–4).
  • Writing: Writes fetched data back to the LQE rs tables using batched commits in parallel threads (recommended: 2–4).
    Example query: INSERT INTO RITRS."REQUIREMENT_RELATIONSHIP" ... ("ACCESS_CONTEXT_ID" ... ) VALUES (:1 , :2 ... )
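
The ROWNUM predicate in the deletion query above implies a chunked-delete loop: delete a bounded number of rows, commit, and repeat until nothing remains. The Python sketch below illustrates that pattern using the python-oracledb driver; the table name RITRS.EXAMPLE_TABLE is a hypothetical placeholder, and this is an illustration of the pattern rather than LQE's actual implementation.

import oracledb  # assumes the python-oracledb driver is installed

def delete_provider_rows(conn, provider_id, chunk=1000):
    """Delete a provider's rows in chunks so each transaction stays small."""
    total = 0
    with conn.cursor() as cur:
        while True:
            cur.execute(
                "DELETE FROM RITRS.EXAMPLE_TABLE "             # hypothetical table
                "WHERE TRS_PROVIDER_ID = :1 AND ROWNUM <= :2",
                [provider_id, chunk],
            )
            deleted = cur.rowcount
            conn.commit()                   # commit after every chunk
            total += deleted
            if deleted < chunk:             # last (partial or empty) chunk reached
                break
    return total

Keeping each chunk small bounds undo and lock usage, which matters when the deletion phase runs for days against tables of this size.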


Recommended thread settings:

  • Number of threads to fetch resources when adding or reindexing this data provider: 2 to 4
  • Number of threads to write resources when adding or reindexing this data provider (PGS Only): 2 to 4

These values define how many threads LQE uses to fetch and write resources in parallel during indexing. They can be adjusted from the individual data provider’s configuration section in the LQE admin console.


Resource Pipeline Queues

The reindexing process in Lifecycle Query Engine (LQE) is optimized using a multi-queue pipeline. This architecture enables efficient, concurrent processing of large volumes of data during indexing.

LQE uses four internal queues to organize and control the flow of data from TRS feeds into the index. Each queue handles a distinct stage of the pipeline. The pipeline queues are displayed during indexing, as shown in the screenshot below.



  • Fetch: Holds TRS base members (URLs) that need to be retrieved. A fetcher thread reads a URL from this queue, makes an HTTP request, and forwards the resulting model to the Process queue.
    Queue size: 60,000,000 (default)
  • Process: Stores fetched models temporarily for validation. A processing task validates each model and sends it to the Batch queue.
    Controlled via: TRS Processing: Process Queue Size (Advanced Property)
  • Batch: Collects validated models into batches. A batcher thread adds models until a size threshold is reached (based on RDF triples), then passes the batch to the Write queue.
    Controlled via: TRS Processing: Batch Queue Size and TRS Processing: Batch Size for Writes (Advanced Properties)
  • Write: Holds model batches ready to be committed to the index. A writer thread reads batches from this queue and stores them in the LQE rs tables.
    Controlled via: Write Queue Size (Relational Store Setting)


This queue-based pipeline ensures that each phase of indexing—fetching, validation, batching, and writing—can operate independently and efficiently using separate threads. It also improves scalability and prevents any single stage from becoming a bottleneck in the indexing process.
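
As an illustration of this pattern (not LQE source code), the Python sketch below wires four queues together with one thread per stage. The queue sizes, the batch threshold, and the stage functions are placeholders; a real deployment runs 2 to 4 fetcher and writer threads, which would need one sentinel per consuming thread.

import queue
import threading

# Placeholder stage implementations; real fetch/validate/commit work goes here.
def http_get(url):
    return {"uri": url}                # pretend we fetched and parsed a model

def validate(model):
    return True                        # pretend every model passes validation

def commit_to_store(batch):
    print(f"committed batch of {len(batch)} models")

fetch_q = queue.Queue()                # TRS base member URLs to retrieve
process_q = queue.Queue(maxsize=1000)  # fetched models awaiting validation
batch_q = queue.Queue(maxsize=1000)    # validated models awaiting batching
write_q = queue.Queue(maxsize=10)      # batches ready to be committed

BATCH_SIZE = 500                       # stand-in for LQE's RDF-triple threshold
SENTINEL = None                        # end-of-stream marker passed down the pipeline

def fetcher():
    while (url := fetch_q.get()) is not SENTINEL:
        process_q.put(http_get(url))
    process_q.put(SENTINEL)

def processor():
    while (model := process_q.get()) is not SENTINEL:
        if validate(model):
            batch_q.put(model)
    batch_q.put(SENTINEL)

def batcher():
    batch = []
    while (model := batch_q.get()) is not SENTINEL:
        batch.append(model)
        if len(batch) >= BATCH_SIZE:
            write_q.put(batch)
            batch = []
    if batch:
        write_q.put(batch)             # flush the final partial batch
    write_q.put(SENTINEL)

def writer():
    while (batch := write_q.get()) is not SENTINEL:
        commit_to_store(batch)

threads = [threading.Thread(target=f) for f in (fetcher, processor, batcher, writer)]
for t in threads:
    t.start()
for i in range(1200):
    fetch_q.put(f"https://host/rm/resources/{i}")  # hypothetical member URLs
fetch_q.put(SENTINEL)
for t in threads:
    t.join()

Because each stage blocks on its own bounded queue, a slow stage applies back pressure to the stage before it instead of exhausting memory; that back-pressure behavior is what the LQE queue-size settings tune.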



What is Validation?

Validation is a process in Lifecycle Query Engine (LQE) that compares the current TRS feed with the existing index to detect and resolve inconsistencies. It provides a lightweight alternative to full reindexing when only a subset of the data is out of sync.

Validation can detect:

  • Missing resources — Present in TRS but not in the index
  • Extra resources — Present in the index but no longer in TRS
  • Stale resources — Outdated versions in the index compared to TRS

By identifying and correcting these discrepancies, validation helps maintain a clean and reliable index for reporting and traceability — without the overhead of a full reindex.
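
Conceptually, the comparison validation performs can be thought of as a set difference over resource URIs plus a version check. The sketch below is illustrative only (not LQE code), assuming both sides can be reduced to a mapping from resource URI to a version identifier such as an ETag:

def classify(trs, index):
    """Both arguments map resource URI -> version identifier."""
    missing = trs.keys() - index.keys()      # in TRS but not in the index
    extra = index.keys() - trs.keys()        # in the index but no longer in TRS
    stale = {uri for uri in trs.keys() & index.keys()
             if trs[uri] != index[uri]}      # present on both sides, versions differ
    return missing, extra, stale

missing, extra, stale = classify(
    trs={"res/1": "v2", "res/2": "v1"},
    index={"res/1": "v1", "res/3": "v1"},
)
# missing == {"res/2"}, extra == {"res/3"}, stale == {"res/1"}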


When to Use Validation

Validation is especially useful when:

  • A report is missing expected resources
  • A report includes outdated (stale) resources
  • Duplicate versions of the same artifact appear
  • An application view and a corresponding JRS report show different artifact counts
  • Deleted resources still appear in reports
  • LQE or LDX shows skipped resources during indexing

Note: Validation is only available for TRS feeds that explicitly support the feature in their application. For the following data providers, you can run the TRS validation to recover data gaps, which is faster than reindexing.

  • IBM® DOORS® Next
  • Engineering Workflow Management Tracking and Planning
  • Engineering Workflow Management SCM Configuration Resources
  • Rhapsody® Model Manager
  • Engineering Test Management
  • Global Configuration Management


How Validation Works

The validation process does not reindex the entire dataset. Instead, it performs a delta comparison between the TRS feed and the indexed data.

If differences are found:

  • Missing data is added to the index
  • Extra data is removed from the index
  • Stale data is updated with the latest version

Unlike indexing, validation does not begin with a full deletion phase, making it significantly faster and less disruptive — especially in large-scale environments.


Validation Steps

To perform validation for a specific data provider:

  1. Open the LQE Data Providers page: https://host:port/lqe/web/admin/data-sources
  2. Click on the name of the TRS feed you want to validate
  3. Click the Validate button in the data provider page header
  4. (Optional) Schedule validation for a specific date and time
  5. (Optional) Check the box labeled Reset last processed event if you want to restart validation from the beginning of the TRS feed.
  6. Click Validate to start the process


What Does the Reset last processed event Option Do?

When you select the Reset last processed event checkbox, LQE will:

  • Discard the previously remembered "last validated event ID".
  • Restart validation from the very beginning of the TRS feed.
  • Re-examine all resources in the TRS feed rather than just changes since the last validation run.

This is useful in scenarios such as:

  • A previous validation run was interrupted or incomplete.
  • You suspect that early events in the TRS feed were skipped or not properly validated.
  • You want to ensure a full and comprehensive validation cycle, not just an incremental one.

Important Notes:

  • This setting does *not delete* any indexed data.
  • It may increase validation time, especially for large TRS feeds, since it reviews all past events.
  • If unchecked, LQE validates only the events that occurred since the last validation.
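
In other words, the checkbox only controls where validation starts reading the TRS event log, as sketched below (the names are hypothetical, not LQE internals):

def starting_event(saved_cursor, reset):
    """Return the TRS event position validation should start from."""
    if reset or saved_cursor is None:
        return 0               # re-examine the feed from the very beginning
    return saved_cursor        # only events since the last validation run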


Interpreting Validation Results

Once validation completes:

  • Status and summary will appear on the History tab of the Data Providers page.
  • Click the status link of the validated TRS feed to view detailed result breakdowns.
  • The table includes metrics such as:
    • Resources added — newly indexed items found in TRS but not in LQE
    • Resources updated — stale (outdated) items re-fetched and replaced
    • Resources removed — extra items present in LQE but not in TRS
    • Resources skipped — items LQE could not process (may indicate issues)

Below is an example of a validation result with highlighted values:

  • Overall Duration: The line near the top shows the total time taken for validation: 1 days, 15 hours, 39 minutes, 34 seconds, 114 milliseconds. This is the total wall-clock time required to validate the full TRS feed.

  • Graphs Column Breakdown: Under the Base processing section, the Graphs field shows the following: pgs 83,265,602 (173 / 173 / 0 / 83,265,429). These values mean:
    • 83,265,602 — total number of resources LQE attempted to validate
    • 173 — resources added
    • 173 — resources updated (e.g., stale/missing)
    • 0 — resources removed
    • 83,265,429 — resources skipped (unchanged or not applicable)

  • Fetch Time Averages:
    • Fetch Avg — average time (269 ms) for resource fetch operations across all threads
    • TRS Fetch Avg — average TRS-specific fetch time (91 ms), useful for performance tuning

  • Removal Section: Indicates whether any extra resources were removed during validation. In this case, no removals occurred, and duration for the removal pass was under 4 minutes.

These results reflect a successful validation cycle, with only a small number of changes required and the majority of resources confirmed as consistent between LQE and the TRS feed.
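
For scripted monitoring, the Graphs field can be parsed mechanically. The helper below assumes the field is rendered exactly as shown in this report; treat the format as an assumption if your build displays it differently.

import re

def parse_graphs(field):
    """Parse e.g. 'pgs 83,265,602 (173 / 173 / 0 / 83,265,429)'."""
    m = re.match(r"\w+\s+([\d,]+)\s+\(([\d,]+) / ([\d,]+) / ([\d,]+) / ([\d,]+)\)",
                 field)
    if not m:
        raise ValueError(f"unrecognized Graphs field: {field!r}")
    total, added, updated, removed, skipped = (
        int(g.replace(",", "")) for g in m.groups())
    return {"total": total, "added": added, "updated": updated,
            "removed": removed, "skipped": skipped}

print(parse_graphs("pgs 83,265,602 (173 / 173 / 0 / 83,265,429)"))
# {'total': 83265602, 'added': 173, 'updated': 173, 'removed': 0, 'skipped': 83265429}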


Validation Logs

You can download logs from the Overview tab of the Data Provider page.

  • Log file name format: lqe-validation-<encoded_data_source_URL>.log
  • Logs categorize discrepancies into:
    • Missing resources – Found in TRS, not in index
    • Stale resources – TRS version is newer than indexed version
    • Extra resources – Found in index, not in TRS

These logs are useful for troubleshooting specific resource issues or confirming data integrity after validation.



Indexing Performance Results

To assess the scalability and efficiency of LQE indexing across different storage technologies, we performed full reindexing and validation on a large repository containing 82 million resources. The goal was to evaluate how the indexing pipeline performs with different database backends: Oracle, Db2, and Jena.

The tests were conducted using a DOORS Next data source against the 7.2.0 M4 build.

Indexing Stages

The end-to-end reindexing operation consists of the following core phases:

  • Deletion — Removes all existing indexed data for the provider.
  • Fetching Resources — Retrieves all resources via TRS feed in batches.
  • Adding to the LQE rs tables — Commits validated data into the relational or Jena store.

Each phase contributes differently to overall indexing time depending on the database backend.

Timing Breakdown by Database

The following table summarizes the time taken for each indexing stage and total duration per database:

Database | Deletion Time | Fetch + Write Time | Total Time
Oracle   | 3 days        | 14.6 days          | ~17.6 days
Db2      | 1.3 days      | 13.4 days          | ~14.7 days
Jena     | 1.2 hours     | 26 days            | ~26 days



When indexed, the LQE database sizes are:

  • Jena LQE: 1277 GB
  • LQE rs Oracle: 842 GB
  • LQE rs Db2: 817 GB

LQE rs requires significantly less disk space for its index than LQE Jena.

Observations:

  • Oracle and Db2 showed similar performance for fetching and writing, but Oracle’s deletion phase was significantly slower.
  • Db2 had the best overall performance with a faster deletion process.
  • Jena was quickest for deletion but had the slowest write throughput, making total indexing much longer.
  • Write performance is the dominant factor in Jena’s overall indexing time.

Key Takeaways:

  • Relational databases (Oracle, Db2) offer better performance for large-scale indexing than Jena.
  • Optimization of the deletion process, especially in Oracle, can further reduce total indexing time.
  • Jena is not suited for indexing high volumes of resources.
  • Infrastructure (disk I/O, memory) and indexing configuration (thread/queue tuning) play a critical role in scaling.



Resource Utilization: CPU and Disk During Reindexing

As part of our reindexing performance validation, we monitored CPU and disk usage across the three key systems involved in the indexing workflow:

  • rtpclmperf24 → DNG Application Server
  • rtpclmperf41 → DNG Database (Db2)
  • rtpclmperf74 → LQE and LQE Database (Oracle)

Resource metrics were collected daily for a 5-hour span during the reindexing cycle. This allowed us to measure system behavior under consistent load conditions and verify that the infrastructure remained healthy throughout.

CPU Utilization



  • The DNG database server rtpclmperf41 showed the highest CPU usage at ~28%, which is expected due to its role in executing heavy read operations during resource fetches.
  • The DNG application server rtpclmperf24 and the Oracle LQE server rtpclmperf74 both showed low CPU utilization, consistently below 5%.
  • This usage pattern confirms efficient CPU distribution and no risk of processor saturation during the reindexing process.

Disk Activity



  • The highest disk throughput was observed on the Db2 database server rtpclmperf41, exceeding 110,000 KB/s. This is due to the large volume of read operations required to extract resources (~82 million resources) from the DNG database during reindexing.
  • Importantly, this high disk activity correlates with elevated CPU wait time — meaning the CPU spends time waiting for disk I/O to complete.
  • The Oracle-based LQE database server rtpclmperf74 showed moderate disk usage (~35,000 KB/s), which aligns with its responsibility to write the validated and indexed data into the LQE store.
  • The DNG application server rtpclmperf24 exhibited minimal disk activity, as it primarily handles TRS feed requests over HTTP with very low I/O demand.

Summary

These CPU and disk measurements validate that:

  • Resource utilization was well balanced across the infrastructure.
  • Each server performed according to its architectural role — Db2 focused on reads, Oracle on writes.
  • No performance bottlenecks or resource saturation were observed over multiple days of sustained reindexing activity.
  • The system maintained consistent throughput and stability, confirming that large-scale reindexing is both sustainable and performant when the environment is properly configured.



Indexing Configuration and Tuning Settings

To ensure stable and efficient indexing of the DOORS Next data source (82M resources), we made several configuration changes across the DOORS Next application server, the RMDB (DOORS Next database), and the IBM HTTP Server (IHS) layer.

RMDB (DOORS Next Application Database) Tuning

To improve the performance of large LOB (Large Object) storage in the RM database, we enabled inline LOBs for the ITEM_STATES table. This reduces LOB lookup overhead during resource retrieval. Note that RMDB is a Db2 database.

Commands used to enable inline LOBs on the RM database:

ALTER TABLE REPOSITORY.ITEM_STATES ALTER COLUMN item_value SET INLINE LENGTH 2048;
REORG TABLE REPOSITORY.ITEM_STATES LONGLOBDATA;

This ensures that small LOBs (up to 2048 bytes) are stored inline with the table rows, reducing access latency during indexing.
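
To confirm the change took effect, the inline length recorded in the Db2 catalog can be checked. The sketch below assumes the ibm_db Python driver and a query against SYSCAT.COLUMNS; the connection string is a placeholder.

import ibm_db  # assumes the ibm_db driver is installed

# Hypothetical connection string; substitute real host and credentials.
conn = ibm_db.connect("DATABASE=RMDB;HOSTNAME=dbhost;PORT=50000;"
                      "PROTOCOL=TCPIP;UID=db2inst1;PWD=secret;", "", "")
stmt = ibm_db.exec_immediate(conn,
    "SELECT COLNAME, INLINE_LENGTH FROM SYSCAT.COLUMNS "
    "WHERE TABSCHEMA = 'REPOSITORY' AND TABNAME = 'ITEM_STATES' "
    "AND COLNAME = 'ITEM_VALUE'")
print(ibm_db.fetch_assoc(stmt))   # expect INLINE_LENGTH = 2048 after the ALTER/REORG
ibm_db.close(conn)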

DOORS Next Application Server Cache Configuration

To support faster access to configuration data and reduce server-side lookup times, we increased the cache size and expiration time for the DOORS Next application.

Add the following lines to the server.startup file:

JAVA_OPTS="$JAVA_OPTS -Dcom.ibm.rdm.configcache.expiration=21600"
JAVA_OPTS="$JAVA_OPTS -Dcom.ibm.rdm.configcache.size=12000"

Explanation:

  • configcache.expiration=21600 sets the cache expiration to 21600 seconds (6 hours)
  • configcache.size=12000 raises the maximum number of cached items to 12,000

These settings helped reduce repeated configuration loads during the LQE indexing process.

IHS Timeout Configuration

To prevent premature I/O disconnects during long-running indexing operations—especially when fetching large batches of data—we updated the IHS plugin configuration.

Modified plugin-cfg.xml entry for the DOORS Next and LQE server nodes:

<Server ConnectTimeout="5" ExtendedHandshake="false"
        MaxConnections="-1" Name="rtpclmperf24Node01_server1"
        ServerIOTimeout="0" WaitForContinue="false">

Setting ServerIOTimeout to 0 disables the I/O timeout, allowing long-running requests from LQE to complete without being terminated by IHS.

Tuning Summary

These infrastructure and server-level optimizations contributed to smoother indexing and minimized interruption risks during extended runs. They are especially recommended for high-volume repositories where indexing time may span multiple days.



Measuring Indexing Phase Timings (Oracle)

To accurately measure the performance of indexing in LQE, we analyzed each phase — Deletion, Fetching, and Writing — using a combination of server-side metrics and database cache snapshots. This section describes how we calculated these timings for the Oracle-based environment.

Note: For Db2 and Jena, we used the built-in indexing time displayed on the History page of the DNG data provider. An example for Db2 is shown in the screenshot below.



The method below applies specifically to Oracle.

Deletion Phase Timing

The deletion phase timing was retrieved directly from the History page of the DNG data provider in LQE.

  • The DNG indexing history shows "Total removal time" for each reindex activity.
  • This value reflects the time taken to delete all previously indexed data from LQE for that data provider.

This value was used as-is and required no further adjustment.

Fetch Phase Timing

To calculate the time spent fetching resources from DNG during indexing, we used internal metrics from the DNG application.

  • Metrics were accessed via the URL: https://<dng-host>/rm/trs2utils?show=metrics
  • We focused on GET requests made by LQE to retrieve resources such as shapes, modules, and artifacts.
  • These metrics include:
    • Total Requests — number of GET requests
    • Total Response Time — cumulative time spent across all threads
  • Since DNG runs 3 threads in parallel to serve these requests, we divided the response time by 3 to approximate real elapsed time.

Below is an example of the DNG TRS metrics page.

Fetch Time Calculation Example:

For metric VersionedShapeRestService::doGet:

Metric                   | Day 2 Value | Day 1 Value | 24h Difference
Total Requests           | 264,015     | 236,141     | 27,874
Total Response Time (ms) | 47,840,655  | 43,161,459  | 4,679,196

In this case:

  • 27,874 shape resources were fetched in 24 hours.
  • Total thread-level response time = 4,679,196 ms.
  • Since 3 threads were used:

Elapsed Time = Total Response Time / Number of Threads
             = 4,679,196 ms / 3
             = 1,559,732 ms 
             = 25.99 minutes

We repeated this process daily, recording the delta over 24-hour intervals for all major GET endpoints, then summed up the adjusted values to estimate the total fetch duration.
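
As a worked example, the snippet below reproduces the calculation above: take two daily snapshot values of a metric's cumulative response time, compute the 24-hour delta, and divide by the number of parallel fetch threads (3 in this test setup).

FETCH_THREADS = 3

def fetch_elapsed_minutes(day1_ms, day2_ms, threads=FETCH_THREADS):
    """Approximate wall-clock fetch time from cumulative response time."""
    delta_ms = day2_ms - day1_ms          # response time accumulated over 24 hours
    return delta_ms / threads / 60_000    # adjust for parallelism, convert ms to minutes

# VersionedShapeRestService::doGet values from the table above:
print(fetch_elapsed_minutes(43_161_459, 47_840_655))   # ~25.99 minutes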

Write Phase Timing

To calculate write phase timing, we analyzed SQL execution metrics from the Oracle cache.

  • We collected Oracle cache data using a SQL query run daily (24-hour intervals).
  • We filtered for INSERT queries targeting LQE internal tables (prefix: RITRS.).
  • We focused on queries with high elapsed time and compared the cache snapshots to compute delta.
  • Since LQE uses 4 threads for writing, total time was divided by 4 to approximate elapsed time.

Query used to extract Oracle SQL cache data:

set linesize 999
set pagesize 999
set long 400000
column parsing_schema_name format a10

SELECT sql_id, parsing_schema_name, elapsed_time, child_number as CHD,
       disk_reads as READ, executions as EXEC, rows_processed as RWS,
       direct_reads as DREAD, buffer_gets as BGET, last_load_time as LDTIME
FROM v$sql
WHERE parsing_schema_name != 'SYS'
ORDER BY elapsed_time DESC;

Note: elapsed_time in the Oracle output is shown in microseconds. We aggregated elapsed time from relevant INSERT statements and divided by 4 (threads) to get estimated write phase duration for each 24-hour window.
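
A sketch of that aggregation is shown below, assuming the daily snapshots have been exported and loaded into dictionaries keyed by sql_id. It is illustrative only; for example, statements that age out of the Oracle cursor cache between snapshots would be undercounted.

WRITE_THREADS = 4

def write_hours(day1, day2, threads=WRITE_THREADS):
    """day1/day2 map sql_id -> {'sql_text': ..., 'elapsed_time': microseconds}."""
    total_us = 0
    for sql_id, row in day2.items():
        if not row["sql_text"].lstrip().upper().startswith("INSERT INTO RITRS."):
            continue                       # keep only LQE internal-table INSERTs
        prev = day1.get(sql_id, {"elapsed_time": 0})
        total_us += row["elapsed_time"] - prev["elapsed_time"]
    return total_us / threads / 3_600_000_000   # adjust for 4 threads; microseconds to hours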

Summary

Phase    | How Timing Was Measured
Deletion | From the History page of the DNG data provider
Fetching | From DNG internal metrics (GET request response time ÷ 3 threads)
Writing  | From the Oracle cache query (elapsed time of INSERTs ÷ 4 threads)

This detailed extraction method allowed us to isolate and analyze how much time each stage of indexing contributed to the total duration—specifically for Oracle-based storage.



Indexing vs Validation

As part of our performance and recovery testing, we explored whether validation could serve as a recovery mechanism for failed indexing scenarios — such as an unexpected server restart.

Tests were conducted on a DOORS Next repository with 82 million resources across all three supported backends: Oracle, Db2, and Jena.

Can Validation Replace Indexing After Indexing Failure?

Validation is not recommended as a replacement for failed indexing recovery.

Here’s why:

  • In the event of a failure (e.g., server restart), the internal database state is unpredictable.
  • LQE may have already deleted a portion of the index before the failure occurred.
  • Starting validation at this point can lead to data loss, because:
    • It will not restore deleted data that was not yet re-fetched
    • It assumes the index is mostly valid, which may not be true in a partial indexing scenario

Conclusion: In case of a failed reindex, the only reliable option is to perform a full reindex to ensure data integrity and consistency.

When is Validation Recommended?

Validation is best used in well-defined, supported cases where the index may have drifted slightly from the TRS feed but is still largely intact.

You should use validation in the following situations:

  • A report is missing resources
  • A report includes outdated (stale) resources
  • A report shows the same artifact multiple times (e.g., with different names or attributes)
  • The number of resources in a JRS report does not match what is seen in the application
  • A report includes deleted resources
  • LQE or LDX displays skipped resources in status or logs

In these cases, validation can fix discrepancies by:

  • Adding missing resources
  • Removing extra resources
  • Updating stale resources

It does this without requiring a full reindex, saving time and reducing load on the system.

Validation vs. Reindexing: Usage Recommendations

Use Case                           | Recommended Action | Reason
Partial index after server failure | Full reindexing    | Database may be in an incomplete or inconsistent state
Missing or stale data in reports   | Validation         | Lightweight and targeted correction
Duplicate or extra resources       | Validation         | Removes outdated or redundant entries
Minor sync issues                  | Validation         | Faster than reindexing and less impact on the system



Test Environment Details

The performance test environment consists of 3 different LQE servers:

  • An LQE server using a Jena index
  • An LQE rs server using Oracle as the backend database
  • An LQE rs server using Db2 as the backend database

Performance tests were conducted against all three LQE servers in order to examine indexing performance on Oracle, Db2, and Jena.

# | Description | Specs               | Notes
1 | LQE Jena    | 40 vCPU, 768 GB RAM | Installed on the LQE/Oracle server
2 | LQE/Oracle  | 40 vCPU, 768 GB RAM | Oracle co-located with LQE
3 | LQE/Db2     | 40 vCPU, 768 GB RAM | Db2 co-located with LQE

All three of these servers were configured to index a single ELM deployment, which was populated with roughly 82 million DNG resources. All three LQE servers therefore had identical indexes.

Deployment Topology

The test topology and hardware specifications are shown below. Two large physical servers comprise the primary LQE test environment.

  • One server hosts LQE Jena, LQE for Oracle, and the Oracle database
  • One server hosts LQE for Db2, as well as the Db2 database

In an ideal topology, the database would be hosted on a server that is separate from LQE. But in this case, the servers have enough CPU and memory to allow for performance testing without concern about interaction between the components.



The servers in the test environment are all physical servers. The hardware specifications are listed below.

Role            | Server                               | Machine type        | Processor                                     | Total processors | Memory | Storage                       | OS and version
Proxy Server    | IBM HTTP Server and WebSphere Plugin | IBM System x3550 M3 | 2 x Intel Xeon X5667 3.07 GHz (quad-core)     | 16               | 16 GB  | RAID 5, 279 GB SAS Disk x 2   | RHEL 7
DNG             | WebSphere Liberty                    | IBM System x3550 M4 | 2 x Intel Xeon E5-2640 2.5 GHz (six-core)     | 24               | 64 GB  | RAID 5, 279 GB SAS Disk x 4   | RHEL 7
JTS/GC/LDX      | WebSphere Liberty                    | IBM System x3550 M4 | 2 x Intel Xeon E5-2640 2.5 GHz (six-core)     | 24               | 32 GB  | RAID 5, 279 GB SAS Disk x 4   | RHEL 7
ETM             | WebSphere Liberty                    | IBM System x3550 M4 | 2 x Intel Xeon E5-2640 2.5 GHz (six-core)     | 24               | 32 GB  | RAID 5, 279 GB SAS Disk x 4   | RHEL 7
EWM             | WebSphere Liberty                    | IBM System x3550 M4 | 2 x Intel Xeon E5-2640 2.5 GHz (six-core)     | 24               | 32 GB  | RAID 5, 279 GB SAS Disk x 4   | RHEL 7
LQE Oracle/Jena | Oracle 19c                           | IBM System SR650    | 2 x Intel Xeon Silver 4114 2.2 GHz (ten-core) | 40               | 768 GB | RAID 10, 900 GB SAS Disk x 16 | RHEL 7
LQE Db2         | Db2 11.5                             | IBM System SR650    | 2 x Intel Xeon Silver 4114 2.2 GHz (ten-core) | 40               | 768 GB | RAID 10, 900 GB SAS Disk x 16 | RHEL 7



Conclusion

The evaluation of indexing and validation in LQE with a large-scale dataset (82 million resources) across Oracle, Db2, and Jena revealed valuable insights that can guide tuning, recovery decisions, and architecture choices in production environments.

  • Use validation for fast fixes and health checks: Validation is an efficient and low-impact operation to resolve common inconsistencies like missing, extra, or stale resources. It is particularly useful when users observe report discrepancies, outdated data, or skipped resources. Since it does not delete existing data, validation is faster and safer for routine integrity checks.

  • Use indexing for full repair or integrity resets: In cases of TRS inconsistencies, initial setup, or failed indexing (e.g., due to server crashes), validation cannot guarantee reliable recovery due to unpredictable database states. A full reindex ensures complete consistency between TRS feeds and the LQE index, making it the only dependable option for critical repairs.

  • Db2 and Oracle provide better performance than Jena: Performance tests showed that relational stores like Db2 and Oracle handled the fetching and writing phases more efficiently than Jena. While Jena offered a faster deletion phase, its overall indexing time was significantly higher due to slower triple-store write performance. Among relational stores, Db2 demonstrated the best overall results, especially in the deletion phase.

  • Optimal configuration of threads and queues is critical for efficiency: LQE's indexing performance heavily depends on how resources are fetched, processed, batched, and written. The four internal indexing queues (Fetch, Process, Batch, Write) must be properly sized, and thread usage should be tuned (recommended: 2–4 threads) to avoid bottlenecks. Aligning configuration with system capabilities leads to substantial performance gains and stability during high-volume indexing.

These findings provide practical guidance for administrators planning to scale LQE environments or maintain index health with minimal disruption. Strategic use of validation and informed tuning of indexing parameters will enable smoother operations and higher confidence in report accuracy.



Abbreviations:
  • JTS = Jazz Team Server
  • GC = Global Configuration application
  • LDX = Link Index Provider
  • DOORS Next = DOORS Next Generation
  • EWM = Engineering Workflow Management
  • ETM = Engineering Test Management
  • LQE = Lifecycle Query Engine
  • LQE rs = Lifecycle Query Engine - Relational Store
  • RHEL = Red Hat Enterprise Linux
  • IHS = IBM HTTP Server

