The purpose of this series of articles is to provide a basic understanding of the indexing processes that Jazz uses for its information query and search features and of the technologies involved, and to provide guidance on the associated administration tasks. We will also briefly review the base architecture and how information is stored and queried in the different CLM applications.
This second part of the series discusses the administration tasks related to indices management and the details of indices storage.
For the remainder of this article, it is important to keep in mind which querying and indexing technology applies to each CLM application. You can find detailed information in Part 1 of this series, summarized in the section called "Recap: Search and indexing in your CLM deployment".
In this article we will cover the storage recommendations and administration tasks for maintaining the indices used by JFS and FullText services. The Item Query Service is not relevant for this discussion as the query and indices features are supported by the database.
The storage of the indices is determined by a couple of application configuration properties, one for each type of index (JFS or FullText). These properties appear in the teamserver.properties file with the following default values after initial application post-installation setup:
com.ibm.team.jfs.index.root.directory=indices
com.ibm.team.fulltext.indexLocation=conf/<APP>/indices/workitemindex
For example, in a typical CCM application deployment, "<APP>" would be "ccm".
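As a minimal sketch of how these two entries could be read back from a properties file, the following Python snippet parses plain `key=value` lines and returns the configured index locations. The function name `read_index_locations` and the sample content are illustrative assumptions; the parser handles only simple lines and comments, not the full Java properties syntax.

```python
# Illustrative only: a tiny key=value parser, not a full Java .properties reader.
INDEX_KEYS = (
    "com.ibm.team.jfs.index.root.directory",
    "com.ibm.team.fulltext.indexLocation",
)


def read_index_locations(properties_text):
    """Return a dict with the index-location properties found in the text."""
    found = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if separator and key.strip() in INDEX_KEYS:
            found[key.strip()] = value.strip()
    return found


# Sample content mirroring the default values for a CCM application.
sample = """\
# teamserver.properties (excerpt)
com.ibm.team.jfs.index.root.directory=indices
com.ibm.team.fulltext.indexLocation=conf/ccm/indices/workitemindex
"""

locations = read_index_locations(sample)
for key in INDEX_KEYS:
    print(key, "->", locations.get(key, "<not set>"))
```

A check like this can be handy before a backup or a storage move, to confirm which locations a given application is actually configured to use.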
These teamserver.properties entries have corresponding parameters in the Advanced Properties wizard of each application, which is accessed through a URL of the form: https://<server>:<port>/<contextRoot>/admin#action=com.ibm.team.repository.admin.configureAdvanced
Keep in mind that all CLM applications except RM have traditionally had a "teamserver.properties" file for application configuration parameters such as the database location and the storage of the JFS and FullText indices. In 5.0.x, RM no longer uses JTS for storage, querying and indexing, and so it also has a teamserver.properties file of its own. Note also that although all these properties exist in every "teamserver.properties" file, their actual relevance in a particular CLM application depends on the query and search technologies that the application uses. Refer to Part 1 of this series for the mapping between CLM applications and the indexing technologies in use.
Now that we have reviewed which configuration properties control the indices storage, how are those properties used to determine the storage location on disk?
The following screenshot shows the layout of a CCM application deployment using Tomcat/Derby and the default configuration:
This storage location can be modified by changing the value of the properties reviewed above, either with the Advanced Properties wizard (recommended) or by editing the application's "teamserver.properties" file. Changing any of these values requires the application to be restarted to take effect. Such a storage location change will create the empty folders that hold the new indices contents, but the old indices are not migrated automatically: you will need either to copy the indices over from the old location while the application is shut down, or to perform a reindex to regenerate them in the new location.
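The copy option can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `copy_indices` is hypothetical, the throwaway directories stand in for the real old and new locations, and the application is assumed to be shut down while the copy runs.

```python
import shutil
import tempfile
from pathlib import Path


def copy_indices(old_location, new_location):
    """Copy the whole index tree to the new location (application stopped)."""
    old_path, new_path = Path(old_location), Path(new_location)
    if not old_path.is_dir():
        raise FileNotFoundError(f"old index location not found: {old_path}")
    # dirs_exist_ok lets the copy fill the empty folders that may already
    # exist at the new location (requires Python 3.8 or later).
    shutil.copytree(old_path, new_path, dirs_exist_ok=True)
    return new_path


# Demonstration with temporary directories and a placeholder file.
with tempfile.TemporaryDirectory() as scratch:
    old_dir = Path(scratch, "old", "indices")
    (old_dir / "workitemindex").mkdir(parents=True)
    (old_dir / "workitemindex" / "placeholder.txt").write_text("index data")
    copied = copy_indices(old_dir, Path(scratch, "new", "indices"))
    copied_files = sorted(
        p.relative_to(copied).as_posix() for p in copied.rglob("*") if p.is_file()
    )
    print(copied_files)
```

The old location is left untouched, so it can serve as a fallback until the application has been verified against the new one.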
We recommend the following configurations for indices storage management:
Enterprise-level deployments usually adopt one of the possible HA configurations for the CLM solution. We will review some considerations for the indexing storage in these configurations:
This section of the article highlights some of the typical administration tasks that you may need to perform to maintain the indices.
Given the importance of the indices for querying and searching for information, it is crucial that you include indices backup and recovery procedures in your general CLM backup strategy. The indices should be backed up together with the database to ensure information consistency, that is, to have a snapshot of both the database information and the indices content. For this backup process, however, we need to take into account how the indices in play differ in nature:
repotools-jts -backupJFSIndexes repositoryURL=https://JTS_SERVER:JTS_SERVER_PORT/jts adminUserID=****** adminPassword=****** toFile=FILELOCATION_AND_NAME
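If the backup is driven from a script, the command line above can be assembled programmatically. The sketch below only builds the argument list from the parameters shown in the command; the helper names `backup_jfs_indexes_command` and `redact` and the sample values are illustrative assumptions, and nothing is executed.

```python
def backup_jfs_indexes_command(repository_url, admin_user, admin_password,
                               to_file, script="repotools-jts"):
    """Build the argument list for the JFS index backup command shown above."""
    return [
        script,
        "-backupJFSIndexes",
        f"repositoryURL={repository_url}",
        f"adminUserID={admin_user}",
        f"adminPassword={admin_password}",
        f"toFile={to_file}",
    ]


def redact(command):
    """Return a copy that is safe to print or log, with the password masked."""
    return [
        "adminPassword=******" if argument.startswith("adminPassword=") else argument
        for argument in command
    ]


# Placeholder values; substitute the details of your own deployment.
command = backup_jfs_indexes_command(
    "https://JTS_SERVER:JTS_SERVER_PORT/jts",
    "ADMIN_USER",
    "s3cret",
    "FILELOCATION_AND_NAME",
)
print(" ".join(redact(command)))
```

The list form can be handed to `subprocess.run(command)` from the server directory that contains the repotools script, while the redacted form keeps the password out of logs.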
Although the JFS indices can be backed up online, the restrictions that Fulltext indices impose for the CCM and QM applications mean that, before 5.0.1, it is advisable to back up both kinds of indices while the CLM applications are shut down. As of 5.0.1, this restriction no longer applies, since online backup of the Fulltext indices is now supported. For complete information on backup, please check Backup the Rational solution for Collaborative Lifecycle Management.
Similarly, the recovery of indices should be performed along with the application repository recovery to ensure consistency of the information. Note that this is particularly important for the FullText indices given how their contents are updated: a missed update between the repository contents and the indices will require you to recreate the indices. JFS indices are able to recover if the information in the database is ahead, although keeping both in sync is still desirable to avoid discrepancies in query results and a performance impact while the indices catch up. For 5.0.1 and above, Fulltext indices, like the JFS indices, recover online from the database contents, so an offline restore/sync is no longer necessary.
The indices can be recreated. The situations in which you will have to consider indices recreation are:
Repository tools commands are available to perform this recreation, with a different command for each type of index.
repotools-jts.bat -reindex all
repotools-ccm.bat -rebuildTextIndices
Both commands require the server to be shut down before they are executed. Recreating the indices can take a long time, depending on how big your repository is. The example commands shown are for Windows deployments; corresponding commands exist for Unix/Linux deployments. Check the official documentation in the provided Information Center.
JFS RDF indices can grow large over time. A command is available that allows you to compact them and save some space:
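For scripts that must run on both platform families, the script name can be chosen at run time. The following sketch assumes the `.bat` and `.sh` naming seen in the examples of this article; the helper names `repotools_script` and `recreate_commands` are hypothetical, and the commands are only built, not executed.

```python
import platform


def repotools_script(application, system=None):
    """Return the repotools script name for the given application and platform."""
    system = system or platform.system()
    extension = ".bat" if system == "Windows" else ".sh"
    return f"repotools-{application}{extension}"


def recreate_commands(system=None):
    """Build the two index recreation commands shown above."""
    return [
        [repotools_script("jts", system), "-reindex", "all"],
        [repotools_script("ccm", system), "-rebuildTextIndices"],
    ]


for command in recreate_commands():
    print(" ".join(command))
```

Passing `system` explicitly makes it possible to preview the commands for a platform other than the one the script is running on.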
repotools-ccm.bat -compacttdb
The complete syntax of the command would be:
repotools-jts.sh -compacttdb srcdir=<RDF indices location> tempdir=<temporary location to use>
There are two important considerations for this command to be run:
conf/jts/indices/<numericID>/jfs-rdfindex. There is an enhancement request open to simplify this: repotools compacttdb should iterate through the indices folder to find the index to compress.
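As a rough sketch of the idea behind that enhancement request, a small script can walk the indices folder, find each `jfs-rdfindex` directory and build one compaction command per directory. The layout `<indices root>/<numericID>/jfs-rdfindex` is taken from the path shown above; the function name `compact_commands`, the temporary location and the demonstration folders are assumptions, and the commands are only printed.

```python
import tempfile
from pathlib import Path


def compact_commands(indices_root, temp_dir, script="repotools-jts.sh"):
    """Build one -compacttdb command per jfs-rdfindex folder under the root."""
    commands = []
    # One level of wildcard matches the <numericID> folder in the path above.
    for candidate in sorted(Path(indices_root).glob("*/jfs-rdfindex")):
        if candidate.is_dir():
            commands.append([
                script,
                "-compacttdb",
                f"srcdir={candidate}",
                f"tempdir={temp_dir}",
            ])
    return commands


# Demonstration with a throwaway folder tree; "1234" is a made-up numeric ID.
with tempfile.TemporaryDirectory() as scratch:
    (Path(scratch) / "1234" / "jfs-rdfindex").mkdir(parents=True)
    (Path(scratch) / "1234" / "other-index").mkdir()
    commands = compact_commands(scratch, "TEMP_LOCATION")
    for command in commands:
        print(" ".join(command))
```

Only folders named `jfs-rdfindex` are selected, so sibling index folders under the same numeric ID are left alone.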
As of 5.0.1, a repotools command exists to verify JFS indices to ensure they are not corrupt.
The syntax is as follows:
repotools-<application> -verifyJFSIndexes [teamserver.properties=conf/<application>/teamserver.properties] [mode=<quickCheck | extensiveCheck>] [logLevel=<errors | warnings | infos>]
Usually the default parameters are enough to verify the integrity of the indexes.
application is jts, ccm, qm or rm.
teamserver.properties defaults to conf/jts/teamserver.properties if not specified.
quickCheck is the default verification mode. extensiveCheck is a more expensive operation that verifies every Lucene index and also reads and verifies all the Jena quads.
logLevel defaults to errors. The warnings log level displays information about discrepancies found in the indexes. Warnings do not mean that the indexes are corrupted; if the indexes are corrupted, this is clearly reported when the command completes. Warnings usually mean that there are more indexes than we expect, which could affect performance and results. For example, a server that was not properly shut down could cause this situation, and it could easily be resolved by restarting the server. The infos log level is mainly useful for debugging purposes.
For example, the command to verify the JFS indices for the JTS using the extensiveCheck verification mode and logging warnings only would be:
repotools-jts.bat -verifyJFSIndexes mode=extensiveCheck logLevel=warnings
Check complete command syntax details here.
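When the verification is launched from a script, it helps to reject misspelled option values before the command runs. The sketch below builds the `-verifyJFSIndexes` command line using only the applications, modes and log levels listed above; the function name `verify_jfs_indexes_command` and the `script_extension` parameter are illustrative assumptions.

```python
VALID_APPLICATIONS = ("jts", "ccm", "qm", "rm")
VALID_MODES = ("quickCheck", "extensiveCheck")
VALID_LOG_LEVELS = ("errors", "warnings", "infos")


def verify_jfs_indexes_command(application, mode=None, log_level=None,
                               script_extension=".bat"):
    """Build the index verification command, validating the option values."""
    if application not in VALID_APPLICATIONS:
        raise ValueError(f"unknown application: {application}")
    if mode is not None and mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    if log_level is not None and log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"unknown logLevel: {log_level}")
    command = [f"repotools-{application}{script_extension}", "-verifyJFSIndexes"]
    # Omitted options fall back to the defaults described above.
    if mode is not None:
        command.append(f"mode={mode}")
    if log_level is not None:
        command.append(f"logLevel={log_level}")
    return command


print(" ".join(verify_jfs_indexes_command("jts", "extensiveCheck", "warnings")))
```

Leaving `mode` and `log_level` unset produces the bare command, which relies on the default quickCheck mode and errors log level.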
As of 5.0.1, a repotools command exists to explicitly force synchronization of the JFS indices with the database.
The syntax is as follows:
repotools-<application> -synchronizeJFSIndexes [teamserver.properties=conf/<application>/teamserver.properties]
application is jts, ccm, qm or rm.
teamserver.properties defaults to conf/jts/teamserver.properties if not specified.
For example, the command to synchronize the JFS indices for the JTS would be:
repotools-jts.bat -synchronizeJFSIndexes
Check complete command syntax details here.