Resource Indexing Additional Service

The indexing described in this specification lets client applications not only find resources but also navigate resource sets and reason about the content of, and relationships between, resources. To support this, the server needs to know which aspects of a resource are relevant, how to extract that information, and how to expose it as a queryable index of the resource.

Motivation

The server builds two kinds of internal indexes: a property index and a full text index. Although similar in many ways, these are two separate internal indexes. This document focuses on the configuration rules for property indexes.

How the server extracts information from resources for its indexes is configurable via a collection of indexing rules. This document describes the configuration of the XML indexer. Index rules apply globally to all resources stored in the server. Within the server, properties are logically stored as triples of the form (subject, predicate, object), where the subject is the resource URI.

The Admin role is required to add, delete, or update indexing rules. Changing a rule can also lead to a long-running operation, since rebuilding the server's indexes may require retrieving and re-indexing many resources. Such changes are expected to happen infrequently; when they do, they may entail migrating existing resources from an old to a new format. Operations that create, update, or delete indexing rules will return 403 Forbidden if the user does not have administration permission.

Specification

The {indexing-rules} URI identifies an Indexing Rules Collection resource that serves as a directory of existing indexing rules and a factory for creating new indexing rules. A URI of the form {indexing-rules}/{rule-id} identifies an Indexing Rule resource, which describes a single indexing rule. (Note that the form of {rule-id} is unspecified; clients MUST treat the URI as opaque.)

Summary of service URIs:

  • {indexing-rules} - Indexing Rules feed resource
  • {indexing-rules}/{rule-id} - Indexing Rule entry resource

Indexing Rules Collection resource

The Indexing Rules Collection resource provides a directory of all known Indexing Rule resources, a factory for creating new Indexing Rule resources, and a means for performing administrative operations like rebuilding the internal indexes.

Method URL Comments
GET/HEAD {indexing-rules} GET returns an atom:feed document with an atom:entry for each indexing rule known to the server (requires special server admin privileges - 403 Forbidden if not authorized). The "src" attribute of the atom:content element is the URI of the Indexing Rule resource. The atom:title for the entry describes the rule.
The response MUST include accurate ETag and Last-Modified headers.
POST {indexing-rules} POST is used to create a new indexing rule (requires special server admin privileges - 403 Forbidden if not authorized). The Content-Type MUST be "application/xml" and the representation MUST be an idx:indexSpecification element (400 Bad Request if it's not). The representation of an Indexing Rule resource is detailed below. The server MUST fail the operation (403 Forbidden) if the new indexing rule would conflict with the existing rules. The response MUST include a Location header with the URI of the newly-created Indexing Rule resource; the response body MUST be an accurate representation of the Indexing Rule resource and be accompanied by accurate ETag and Last-Modified headers.
POST {indexing-rules}?reindex POST is also used to trigger server index maintenance. POSTing to the URI /jazz/indexing-rules?reindex with an empty body requests that the server rebuild its internal indexes (requires special server admin privileges - 403 Forbidden if not authorized). Rebuilding the internal indexes can be a lengthy operation. On receipt of the reindex POST the server will return 202 Accepted with a Location response header indicating a resource that the client can GET to discover the progress of the long-running operation. This makes the long-running reindex operation entirely asynchronous. This operation may not be executed more than once concurrently; an attempt to start the operation when it is already running will return 400 Bad Request.
PUT {indexing-rules} PUT is not permitted (405 Method not allowed). The Indexing Rules Collection resource has no significant atom:feed header information that would require updating.
DELETE {indexing-rules} DELETE is not permitted (405 Method not allowed).
  1. GET returns an idx:indexSpecification document describing the particular indexing rule. The response MUST include accurate ETag and Last-Modified headers. The server MUST return 404 Not Found if the client has the necessary permissions but there is no such Indexing Rule resource.
  2. HEAD is supported with the same rules as GET.
  3. PUT is used to update an existing indexing rule (requires the Admin role - otherwise 403 Forbidden). The Content-Type MUST be "application/xml" and the representation MUST be a valid idx:indexSpecification document (400 Bad Request if it's not). The server MUST fail the operation (403 Forbidden) if the modified indexing rule would be in conflict with any of the other index rules. When an indexing rule is updated, the server's internal indexes become stale, and will remain that way until they are rebuilt. The request MUST include non-trivial If-Match and If-Unmodified-Since headers (400 Bad Request if headers not included; 409 Conflict if resource exists but preconditions fail; 412 Precondition Failed if resource does not exist).
  4. POST is not permitted (405 Method Not Allowed).
  5. DELETE is used to delete an indexing rule (requires the Admin role - otherwise 403 Forbidden). When an indexing rule is deleted, the server's internal indexes become stale, and will remain that way until they are rebuilt. The request MUST include non-trivial If-Match and If-Unmodified-Since headers (400 Bad Request if headers not included; 409 Conflict if resource exists but preconditions fail; 412 Precondition Failed if resource does not exist).
  6. Indexing Rule resources do not track revisions.
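
As an illustration of the precondition rules for PUT and DELETE above, the following Python sketch builds (but does not send) a conditional request. The function name build_update_request and the example URI are hypothetical; only the header names, the content type, and the status codes mentioned in the comments come from this specification.

```python
import urllib.request


def build_update_request(rule_uri, body, etag, last_modified, method="PUT"):
    """Build (but do not send) a conditional PUT or DELETE for an Indexing Rule resource."""
    if not etag or not last_modified:
        # Per the rules above, the server answers 400 Bad Request
        # when either precondition header is missing.
        raise ValueError("both ETag and Last-Modified values are required")
    headers = {
        "If-Match": etag,
        "If-Unmodified-Since": last_modified,
    }
    data = None
    if method == "PUT":
        # PUT carries an idx:indexSpecification document as application/xml.
        headers["Content-Type"] = "application/xml"
        data = body.encode("utf-8")
    return urllib.request.Request(rule_uri, data=data, headers=headers, method=method)
```

A client would typically take the ETag and Last-Modified values from a preceding GET of the same Indexing Rule resource and pass them through unchanged.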

Indexing Rule resources

An Indexing Rule resource describes a particular indexing rule. An XML property indexing rule is represented by an idx:indexSpecification element. Indexing Rule documents are in the following XML namespace: http://example.org/xmlns/openservices/v0.6.

<?xml version="1.0"?>
<indexSpecification xmlns="http://example.org/xmlns/openservices/v0.6"
     namespace="http://music.example.org/schema"
     onlyForType="application/x-com.ibm.examples.music+xml">
   <index element="//title" />
   <index element="//genre" />
   <index element="//release-date">
      <property object="." objectType="date" />
   </index>
   <index element="//cover-art">
      <property object="./@href" objectType="uri" />
   </index>
</indexSpecification>
Method URL Comments
GET/HEAD {indexing-rules}/{rule-id} Requires Reader role, otherwise 403 (forbidden).
POST {indexing-rules}/{rule-id} Not permitted - returns 405 (Method not allowed).
PUT {indexing-rules}/{rule-id} Requires Admin role, otherwise 403 (forbidden).
DELETE {indexing-rules}/{rule-id} Requires Admin role, otherwise 403 (forbidden).

XML Representation Specification

XML Representation Summary: indexSpecificationElement
<indexSpecification
  namespace = xsd:anyURI
  onlyForType = xsd:string >
  Content: ((index | secondaryResource)+)
</indexSpecification>
Property Representation
{namespace} The required XML namespace URI. This is used to match XPath expressions for "element" and "secondaryResource" in each index element.
{onlyForType} The optional content type match. This set of rules will only be evaluated against resources that are stored with the specified content type.
{index} The required set of indexing rules.

The "namespace" attribute provides the URI of an XML namespace. There MUST NOT be more than one XML property indexing rule for any given XML namespace. The server MUST detect such conflicts when indexing rules are being created or updated, and fail the operation (403 Forbidden) if there are conflicts.
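
The conflict check described above could be sketched as follows in Python. This is a minimal illustration, not the server's implementation: the function name conflicting_rule and the dictionary of rule ids to namespaces are assumptions made for the example.

```python
def conflicting_rule(existing_rules, new_namespace, updating_rule_id=None):
    """Return the id of an existing rule that already claims new_namespace, or None.

    existing_rules maps a (hypothetical) rule id to the value of that rule's
    "namespace" attribute. updating_rule_id is the rule being replaced by a PUT,
    which must not be reported as conflicting with itself.
    """
    for rule_id, namespace in existing_rules.items():
        if namespace == new_namespace and rule_id != updating_rule_id:
            return rule_id
    return None
```

A server following this sketch would answer 403 Forbidden whenever the function returns a rule id.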

The "onlyForType" attribute provides a MIME content-type that allows clients to identify the specific type of resources to which this index applies. This optimization ensures that rules that are type-specific are not applied in general to all resources.

XML Representation Summary: indexElement

<index
  element = string >
  Content: (property*)
</index>
Property Representation
{element} The required "XPath lite" expression to match an XML element in the target document.
{property} The optional set of property indexing rules to apply relative to the element matched by the {element} attribute.

Each idx:index describes how to extract a set of triples from the representation of the matched resource. The "element" attribute specifies an "XPath lite" expression to match for the index rule to be applied.

Notes:

  1. If an idx:index element is specified with no idx:property elements, the predicate is the matching element's tag name.
    1. If the target of the "element" XPath expression is an element, the object is the text content of the element and the object type is string.
    2. If the target of the "element" XPath expression is an attribute, the object is the value of the attribute and the object type is string.
  2. If an idx:index element is specified with one or more idx:property elements, then all index triples will be extracted using the rules specified by the contained idx:property elements only.
  3. If an idx:index element specifies more than one idx:property and more than one of those elements resolves to an index triple, then an RDF blank node corresponding to the idx:index element will be created to contain all of the resolved index triples. The result of this is a compound-valued property for the idx:index rule.

XML Representation Summary: secondaryResourceElement

<secondaryResource
  element = string >
  Content: (property*,index*)
</secondaryResource>
Property Representation
{element} The required "XPath lite" expression to match an XML element or attribute in the target document.
{property} The optional set of property indexing rules to apply relative to the element matched by the {element} attribute.
{index} The optional set of nested indexing rules to apply relative to the element matched by the {element} attribute.

Each idx:secondaryResource describes how to extract a set of triples from the representation of the matched resource. The extracted properties will be considered as part of the secondary resource identified by the "element" expression. The "element" attribute specifies an "XPath lite" expression to match for the index rule to be applied.

Notes:

  1. The "element" attribute may specify either an element or an attribute to match. The subject of the resulting index is determined by the type of node that is matched.
    1. If an attribute is specified it is assumed to be an ID and the subject URI becomes the resource-uri + "#" + ID.
    2. If an element is specified the subject URI becomes the resource-uri + "#" + xpath path to the element.
  2. If an idx:secondaryResource element is specified with no idx:property or idx:index elements, the predicate is the matching element's tag name.
    1. If the target of the "element" XPath expression is an element, the object is the text content of the element and the object type is string.
    2. If the target of the "element" XPath expression is an attribute, the object is the value of the attribute and the object type is string.
  3. If an idx:secondaryResource element is specified with one or more idx:property or idx:index elements, then all index triples will be extracted using the rules specified by those contained elements only.
  4. All index triples resolved by idx:property elements will be represented as simple triples belonging to the secondary resource.
  5. Compound-value properties within a secondary resource can be indexed using nested idx:index rules that specify multiple idx:property elements.

XML Representation Summary: propertyElement

<property
  predicate = string
  object = string
  objectType = (string | int | boolean | date | uri) : string>
  Content: empty
</property>
Property Representation
{predicate} The optional "XPath lite" expression used to denote the node (element or attribute) which contains the predicate name (may also specify the XPath function local-name()).
{object} The required "XPath lite" expression used to denote the node (element or attribute) which contains the object value (may also specify the XPath function local-name()).
{objectType} The optional datatype of the object, one of string, int, boolean, date, uri.

Each idx:property describes how to extract a single triple from the representation of the matched resource; its attributes specify the predicate, object and object type.

Notes:

  1. The predicate XPath expression identifies a node that contains the value to be used as the statement predicate. Any value must be an XML NCName and will be made into a URL using the namespace for the indexer.
  2. The object XPath expression identifies both the node which contains the value to be used as the statement object/value and the default predicate (if not explicitly specified), using the following rules:
    1. If the object identifies an attribute or element, the default predicate is the qualified form "namespace#node-name".
    2. If the object contains the XPath function local-name(), the value is taken to be the unqualified name of the node selected by secondaryResource or element, and the default predicate becomes "http://www.w3.org/TR/xpath20#local-name".
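
The default-predicate rules above can be sketched in Python as follows. The function name default_predicate is hypothetical. The handling of an object expression of "." (using the name of the matched context element) is an assumption inferred from the release-date rule in the basic example later in this document, not something the notes above state explicitly.

```python
XPATH_LOCAL_NAME = "http://www.w3.org/TR/xpath20#local-name"


def default_predicate(namespace, object_expr, context_name):
    """Derive the predicate used when a property rule has no "predicate" attribute.

    namespace    -- the "namespace" attribute of the enclosing indexSpecification
    object_expr  -- the relative "XPath lite" expression in the "object" attribute
    context_name -- local name of the node matched by the enclosing rule
    """
    if object_expr.endswith("local-name()"):
        return XPATH_LOCAL_NAME
    last_step = object_expr.split("/")[-1]
    if last_step in (".", ""):
        # Assumption: "." selects the context node, so its name is used.
        return f"{namespace}#{context_name}"
    # Strip the attribute marker so ./@href and ./href both yield "href".
    return f"{namespace}#{last_step.lstrip('@')}"
```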

Normalization of indexed URI properties

The indexer MUST normalize indexed URI properties so that queries for references to a particular resource can be resolved. For example, a query of the form:

/jazz/services/query?
  http://www.example.org/music#cover-art=/jazz/resources/myProject/covers/Hendrix1.png

should be resolvable even if the link was specified as relative in the referencing document, as in <a href="../covers/Hendrix1.png"/>.

The following URI expansion rules are applied to normalize indexed URI properties:

  1. The protocol, host, and port are stripped from URIs that refer to the host server.
  2. Relative path references are expanded to absolute path references.
  3. Relative paths are expanded with respect to the containing resource's URI if no base is specified by the resource. Otherwise, relative paths are expanded with respect to the resource's specified base.

XML documents may specify a base against which all relative links are expanded, using the standard xml:base mechanism (http://www.w3.org/TR/xmlbase/). RFC 2396 [IETF RFC 2396] provides for base URI information to be embedded within a document. The rules for determining the base URI can be summarized as follows (highest priority to lowest):

  1. The base URI is embedded in the document's content.
  2. The base URI is that of the encapsulating entity (message, document, or none).
  3. The base URI is the URI used to retrieve the entity.
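
The expansion rules above can be approximated with Python's standard urllib.parse module. This is a sketch under stated assumptions: the function name normalize_uri, the server_origin parameter, and the example host are hypothetical, and the comparison of scheme and authority is a simplification of "URIs that refer to the host server".

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def normalize_uri(ref, resource_uri, server_origin, xml_base=None):
    """Normalize a link found in a resource so that it can be matched by queries.

    ref           -- the link as written in the document (may be relative)
    resource_uri  -- absolute URI the containing resource was retrieved from
    server_origin -- scheme and authority of the host server (assumed known)
    xml_base      -- value of xml:base in scope for the link, if any
    """
    # An embedded base takes priority over the retrieval URI.
    base = urljoin(resource_uri, xml_base) if xml_base else resource_uri
    absolute = urljoin(base, ref)
    parts = urlsplit(absolute)
    origin = urlsplit(server_origin)
    if (parts.scheme, parts.netloc) == (origin.scheme, origin.netloc):
        # Links to the host server are stored without protocol, host, and port.
        return urlunsplit(("", "", parts.path, parts.query, parts.fragment))
    return absolute
```

For instance, with a resource retrieved from https://host:9443/jazz/resources/myProject/tracks/t1.xml (a made-up location), the relative link ../covers/Hendrix1.png normalizes to /jazz/resources/myProject/covers/Hendrix1.png, which is the form used in the query shown earlier.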

Multi-namespace documents

It is common, and indeed recommended, practice to design XML resources as compound documents that mix elements and attributes from different namespaces. The XML indexer supports this practice by attempting to apply all indexing rules that match the namespaces used by an instance document. The indexer uses strict namespace matching to find the "element" for idx:index and idx:secondaryResource rules, but is namespace-neutral when evaluating the XPath for the "predicate" and "object" in idx:property rules. This implies that the node selected by a predicate or object expression need not be in the namespace of the element. Note that namespace-neutral means the indexer does not match on the node's namespace; any node to be indexed must still have a valid namespace, and non-namespaced nodes are not matched.

XML Form Comments
...
  <glossary:term glossary:id="t1" glossary:name="term 1"/>
...
In this example the glossary and the attributes that are expected to be indexed are clearly in the same namespace, so this rule is straightforward.
...
  <glossary:term id="t1" name="term 1"/>
...
In this example it is not clear whether the id and name attributes are part of the glossary namespace or not (it will depend on the attributeFormDefault value in the schema). This is a common situation and the indexer will handle it by treating attributes in a namespace-neutral manner. However, if the XPath processor determines that id and name have no associated namespace, the indexer will fail to match them.
...
  <glossary:term base:id="t1" base:name="term 1"/>
...
Due to the introduction of namespace-neutral attribute handling, this case works. The indexer will allow the definition of an index rule that matches the glossary term (as either an element or secondary resource) and then select id and name as index properties of the term.
...
  <glossary:term>
     <glossary:id>t1</glossary:id>
     <glossary:name>term 1</glossary:name>
  </glossary:term>
...
In this case the term and the child elements are once more all in the same namespace. This is a straightforward indexing rule and is demonstrated in the examples below.
...
  <glossary:term>
     <base:id>t1</base:id>
     <base:name>term 1</base:name>
  </glossary:term>
...
This rule also works as the indexer only specifies the matching namespace for the element and not the predicate and object.

For example, consider the following index rule definition.

<idx:indexSpecification xmlns:idx="http://example.org/xmlns/openservices/v0.6"
  namespace="http://ibm/rdm/sketch-link" >
  <idx:secondaryResource element="//button/@id">
    <idx:property object="./label" objectType="string"/>
    <idx:property object="./local-name()" objectType="string"/>
    <idx:property predicate=".//link/@rel" object=".//link/@href" objectType="uri"/>
  </idx:secondaryResource>
  <idx:secondaryResource element="//input/@id">
    <idx:property object="./label" objectType="string"/>
    <idx:property object="./local-name()" objectType="string"/>
  </idx:secondaryResource>
</idx:indexSpecification>

Then, given the following resource, the link elements should all be indexed regardless of namespace. However, the last link on the first button will not be indexed, because its rel and href attributes do not specify any namespace at all.

<sketch xmlns="http://ibm/rdm/sketch-link" 
        xmlns:base="http://ibm/rdm/base" 
        xmlns:xhtml="http://www.w3.org/1999/xhtml" >
  <window>
    <button id="b1">
      <label>First</label>
      <base:link base:rel="self" base:href="/jazz/resources/p/test1" />
      <base:link base:rel="other" base:href="/jazz/resources/p/test2" />
      <xhtml:link xhtml:rel="stylesheet" type="text/css" xhtml:href="/jazz/resources/p/stylesheet" />
      <link rel="bogus" href="urn:no-namespace-here" />
    </button>
  </window>
  <dialog>
    <frame>
      <panel>
        <button id="b2">
          <label>Second</label>
          <base:link base:rel="self" base:href="/jazz/resources/p/test3" />
        </button>
        <input id="i1">
          <label>First</label>
        </input>
      </panel>
    </frame>
  </dialog>
</sketch>

These then are the resulting index properties.

Subject Predicate Object Type
project/structured-sketch-2.xml ##rootElement http://ibm/rdm/sketch#sketch string
project/structured-sketch-2.xml#b1 http://ibm/rdm/sketch#label First string
project/structured-sketch-2.xml#b1 http://ibm/rdm/sketch#button button string
project/structured-sketch-2.xml#b1 http://ibm/rdm/base#self /jazz/resources/p/test1 uri
project/structured-sketch-2.xml#b1 http://ibm/rdm/base#other /jazz/resources/p/test2 uri
project/structured-sketch-2.xml#b1 http://www.w3.org/1999/xhtml#stylesheet /jazz/resources/p/stylesheet uri
project/structured-sketch-2.xml#b2 http://ibm/rdm/sketch#label Second string
project/structured-sketch-2.xml#b2 http://ibm/rdm/sketch#button button string
project/structured-sketch-2.xml#b2 http://ibm/rdm/base#self /jazz/resources/p/test3 uri
project/structured-sketch-2.xml#i1 http://ibm/rdm/sketch#label First string
project/structured-sketch-2.xml#i1 http://ibm/rdm/sketch#input input string

Re-indexing

To ensure that resources are up-to-date with respect to index specifications an application can either:

  • Ensure all index specifications are added prior to the addition of resources, or
  • When index specifications need to be added or changed run a re-index operation.

The re-index operation scans all resources and re-applies all relevant indexers and index rules to them to see if the stored index properties need to be changed. This is a long-running operation that can have a significant performance impact on the server, and therefore should not be run frequently. To invoke the operation the client application performs a POST with the single query parameter ?reindex. The server will respond with a 202 (Accepted) status code and a Location response header that identifies a resource that will hold the status of the operation. The general behavior is therefore:

  • Client performs POST to /jazz/indexing-rules?reindex.
  • Server responds with a status code of 202 and a Location response header value of /jazz/indexing-rules/progress/20080110T113741.762-0500.
  • The client can then GET this progress resource, noting the value of the /operation/status element which will be "running" while executing and then "completed" when done.
  • Once done, the /operation/errors/error* elements will contain any reported errors, or the client can assume the operation completed successfully.
  • It is the client application's responsibility to delete the progress resource when it has no further need for it.

The following is an example of the re-index progress resource.

<?xml version="1.0" encoding="utf-8"?>
<operation xmlns="http://example.org/xmlns/openservices/v0.6">
  <name>reindexing</name>
  <status>completed</status>
  <count>32</count>
  <errors/>
</operation>

The re-index operation may not be run more than once at the same time. If the re-index operation is running, any subsequent call to re-index will respond with a 400 (Bad Request) status code.
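
A client might interpret the progress resource along the lines of the following Python sketch. The function name read_progress is hypothetical, and the sketch assumes each error is reported as the text content of an error element under errors, a detail this specification does not spell out. Fetching the resource over HTTP is deliberately left out.

```python
import xml.etree.ElementTree as ET

OPS_NS = "http://example.org/xmlns/openservices/v0.6"


def read_progress(xml_text):
    """Return (done, errors) for a re-index progress document.

    done is True once /operation/status reads "completed"; errors lists the
    text of any /operation/errors/error elements.
    """
    root = ET.fromstring(xml_text)
    ns = {"op": OPS_NS}
    status = root.findtext("op:status", default="", namespaces=ns)
    errors = [e.text or "" for e in root.findall("op:errors/op:error", ns)]
    return status.strip() == "completed", errors
```

A polling client would call this on each GET of the progress resource, wait while the first value is False, and delete the progress resource once it has read the final result.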

XPath Subset used in index rules

The following describes the subset of XPath accepted by the indexer today. Note that the XPath used in "element" attributes MUST be an absolute expression; that is, it must start with "/" or "//". In contrast, the object and predicate expressions MUST be relative expressions; that is, they must start with ".", "./" or ".//". In addition, predicates may specify the "literal()" function.

  • /foo - matches the element "foo" if, and only if, it is the root element.
  • //foo - matches the element "foo" anywhere it appears in the document.
  • Deprecated: foo - shortcut for //foo
  • /@foo - matches the attribute "foo" if, and only if, it is on the root element.
  • //@foo - matches the attribute "foo" anywhere it appears in the document.
  • Deprecated: @foo - shortcut for //@foo
  • /foo/bar - matches the element "bar" if, and only if, it is the child of a root element named "foo".
  • //foo/bar - matches the element "bar" if it is the child of an element named "foo" that appears anywhere in the document
  • //foo/bar/moo - and so on...
  • ./foo - matches the element "foo" if, and only if, it is a child of the "current" node - used in specifying predicates/objects for secondary resources.
  • .//foo - matches the element "foo" anywhere it appears as a child of the "current" node - used in specifying predicates/objects for secondary resources.
  • ./@foo - matches the attribute "foo" if, and only if, it is a child of the "current" node - used in specifying predicates/objects for secondary resources.
  • .//@foo - matches the attribute "foo" anywhere it appears as a child of the "current" node - used in specifying predicates/objects for secondary resources.
  • /local-name() or ./local-name() or /foo/local-name() or /@foo/local-name() or ./foo/local-name() or ./@foo/local-name() - returns the name of the selected attribute or element node rather than its content.
  • literal(String) - allows a property rule to specify a string literal for the predicate.

Note that each expression above is namespace-aware, so an example such as //foo/bar logically maps to an XPath expression of the form:

declare pre = namespace
  return //pre:foo/pre:bar
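
The absolute versus relative requirement described in this section can be checked mechanically. The following Python sketch is one possible pre-flight check a client might run before submitting a rule; the function names classify_expression and is_allowed are hypothetical, and the sketch only inspects the start of each expression rather than validating the full "XPath lite" grammar.

```python
def classify_expression(expr):
    """Classify an "XPath lite" expression by its leading characters."""
    if expr.startswith("literal(") and expr.endswith(")"):
        return "literal"
    if expr.startswith("/"):
        return "absolute"      # covers both /foo and //foo
    if expr == "." or expr.startswith("./"):
        return "relative"      # covers ., ./foo and .//foo
    return "deprecated"        # bare foo or @foo shortcuts


def is_allowed(attribute, expr):
    """Check an expression against the rule attribute it is used in."""
    kind = classify_expression(expr)
    if attribute == "element":
        return kind in ("absolute", "deprecated")
    if attribute == "predicate":
        return kind in ("relative", "literal")
    if attribute == "object":
        return kind == "relative"
    raise ValueError(f"unknown rule attribute: {attribute}")
```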

Examples

A Basic Example

The following XML property indexing rule is for XML namespace identified as "http://music.example.org/schema".

<?xml version="1.0" encoding="UTF-8"?>
<indexSpecification xmlns="http://example.org/xmlns/openservices/v0.6" 
     namespace="http://music.example.org/schema"
     onlyForType="application/x-com.ibm.examples.music+xml">
   <index element="//title" />
   <index element="//genre" />
   <index element="//release-date">
     <property object="." objectType="date" />
   </index>
   <index element="//cover-art">
     <property object="./@href" predicate="./local-name()" objectType="uri" />
   </index>
</indexSpecification>

When applied to a resource whose content-type is application/x-com.ibm.examples.music+xml and whose XML-based representation is:

<?xml version="1.0" ?>
<track xmlns="http://music.example.org/schema">
  <title>Do you know the way to San Jose</title>
  <genre>pop</genre>
  <genre>rock</genre>
  <language>english</language>
  <release-date>1971-04-30T00:00:01Z</release-date>
  <cover-art href="http://music.example.org/cat-1248627636" />
</track>

The indexer will extract the following set of triples:

Subject Predicate Object Type
resource-uri http://music.example.org/schema#title Do you know the way to San Jose string
resource-uri http://music.example.org/schema#genre pop string
resource-uri http://music.example.org/schema#genre rock string
resource-uri http://music.example.org/schema#release-date 1971-04-30T00:00:01Z date
resource-uri http://music.example.org/schema#cover-art http://music.example.org/cat-1248627636 uri
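
To make the extraction concrete, the following Python sketch reproduces the table above with the standard xml.etree.ElementTree module. It is an illustration only: the rules are hard-coded in a hypothetical MUSIC_RULES table rather than parsed from the indexSpecification, and the function name extract_triples is an assumption.

```python
import xml.etree.ElementTree as ET

MUSIC_NS = "http://music.example.org/schema"

# (local element name, attribute holding the object or None for text, object type)
MUSIC_RULES = [
    ("title", None, "string"),
    ("genre", None, "string"),
    ("release-date", None, "date"),
    ("cover-art", "href", "uri"),
]


def extract_triples(resource_uri, xml_text):
    """Apply the hard-coded rules and return (subject, predicate, object, type) tuples."""
    root = ET.fromstring(xml_text)
    triples = []
    for local_name, attribute, object_type in MUSIC_RULES:
        # iter() visits the root and all descendants, like the //name form.
        for node in root.iter(f"{{{MUSIC_NS}}}{local_name}"):
            value = node.get(attribute) if attribute else (node.text or "")
            if value is not None:
                triples.append((resource_uri, f"{MUSIC_NS}#{local_name}", value, object_type))
    return triples
```

Elements without a rule, such as language in the track document, produce no triples.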

A Secondary Resource Example

The following is a common glossary document: the glossary itself has a name that should be indexed, and it contains a set of terms that should each be indexed as a distinct secondary resource. Below is an example of such a glossary resource.

<Glossary xmlns="http://ibm/rdm/glossary" name="glossary1">
  <term id="t1" name="term1" status="published" definition="term1 defined"/>
  <term id="t2" name="term2" status="published" definition="term2 defined"/>
  <term id="t3" name="term3" status="published" definition="term3 defined"/>
</Glossary>

The following is an indexer specification that provides the correct set of index properties (triples) shown in the next table.

<indexSpecification namespace="http://ibm/rdm/glossary">
  <index element="/Glossary">
    <property object="./@name" />
  </index>
  <secondaryResource element="//term/@id">
    <property object="./@name" />
    <property object="./@status" />
    <property object="./@definition" />
  </secondaryResource>
</indexSpecification>

Notes:

  1. To index the name of the glossary resource we use the element selector and then extract the object from the name attribute of the selected element.
  2. To index the properties of each term we use the secondaryResource selector, but note that we actually identify the id attribute of the term element, which is then used as the subject ID.
  3. Each term property is extracted with an attribute selector; note that the XPath context (.) is assumed to be the element selected as //term.

The indexer will extract the following set of triples:

Subject Predicate Object Type
resource-uri http://ibm/rdm/glossary#name glossary1 string
resource-uri#t1 http://ibm/rdm/glossary#name term1 string
resource-uri#t1 http://ibm/rdm/glossary#status published string
resource-uri#t1 http://ibm/rdm/glossary#definition term1 defined string
resource-uri#t2 http://ibm/rdm/glossary#name term2 string
resource-uri#t2 http://ibm/rdm/glossary#status published string
resource-uri#t2 http://ibm/rdm/glossary#definition term2 defined string
resource-uri#t3 http://ibm/rdm/glossary#name term3 string
resource-uri#t3 http://ibm/rdm/glossary#status published string
resource-uri#t3 http://ibm/rdm/glossary#definition term3 defined string
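
The way secondary resource subjects are formed (resource URI, then "#", then the id value) can be sketched in Python as follows. The function name glossary_triples is hypothetical and the rules are hard-coded for the attribute-centric glossary document above; a real indexer would interpret the indexSpecification instead.

```python
import xml.etree.ElementTree as ET

GLOSSARY_NS = "http://ibm/rdm/glossary"


def glossary_triples(resource_uri, xml_text):
    """Return (subject, predicate, object, type) tuples for a glossary document."""
    root = ET.fromstring(xml_text)
    triples = []
    # Index rule on /Glossary with object ./@name: subject is the resource itself.
    if root.tag == f"{{{GLOSSARY_NS}}}Glossary" and "name" in root.attrib:
        triples.append((resource_uri, f"{GLOSSARY_NS}#name", root.get("name"), "string"))
    # Secondary resource rule keyed on the id attribute of each term element.
    for term in root.iter(f"{{{GLOSSARY_NS}}}term"):
        term_id = term.get("id")
        if term_id is None:
            continue  # no id attribute, so the rule does not match this term
        subject = f"{resource_uri}#{term_id}"
        for attr in ("name", "status", "definition"):
            if attr in term.attrib:
                triples.append((subject, f"{GLOSSARY_NS}#{attr}", term.get(attr), "string"))
    return triples
```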

To construct the very same set of triples but from an element-centric form of the very same document we need to change the indexing rules only a little. Here's an example document.

<Glossary xmlns="http://ibm/rdm/glossary">
  <name>glossary1</name>
  <term id="t1">
    <name>term1</name>
    <status>published</status>
    <definition>term1 defined</definition>
  </term>
  <term id="t2">
    <name>term2</name>
    <status>published</status>
    <definition>term2 defined</definition>
  </term>
  <term id="t3">
    <name>term3</name>
    <status>published</status>
    <definition>term3 defined</definition>
  </term>
</Glossary>

And here is the corresponding index specification. Note that the only real difference is the removal of the "@" character in each object selector.

<indexSpecification namespace="http://ibm/rdm/glossary">
  <index element="/Glossary">
    <property object="./name" />
  </index>
  <secondaryResource element="//term/@id">
    <property object="./name" />
    <property object="./status" />
    <property object="./definition" />
  </secondaryResource>
</indexSpecification>

Distinguishing Common Properties

In the following example, we have two particular needs:

  1. We need to be able to index and refer to each labeled widget. The button, input, and menu elements all have label properties and we would like to be able to run a query to find all things labeled "First" and get the URL for the discrete secondary resources.
  2. We need to be able to scope queries such that we can ask for things labeled "First" but only those that are buttons.

Given the example sketch below we need to be able to run the two queries above in some way.

<sketch xmlns="http://ibm/rdm/sketch" >
  <window>
    <button id="b1">
      <label>First</label>
    </button>
  </window>
  <dialog>
    <frame>
      <panel>
        <button id="b2">
          <label>Second</label>
        </button>
        <input id="i1">
          <label>First</label>
        </input>
      </panel>
    </frame>
  </dialog>
</sketch>

Using the local-name() function we can index the local name of the selected element along with its value.

<indexSpecification namespace="http://ibm/rdm/sketch">
  <secondaryResource element="//button/@id">
    <property object="./label" />
    <property object="./local-name()" />
  </secondaryResource>
  <secondaryResource element="//input/@id">
    <property object="./label" />
    <property object="./local-name()" />
  </secondaryResource>
</indexSpecification>

The resulting triples shown below satisfy both the first and second queries.

Subject Predicate Object Type
resource-uri#b1 http://ibm/rdm/sketch#label First string
resource-uri#b1 http://www.w3.org/TR/xpath20#local-name button string
resource-uri#b2 http://ibm/rdm/sketch#label Second string
resource-uri#b2 http://www.w3.org/TR/xpath20#local-name button string
resource-uri#i1 http://ibm/rdm/sketch#label First string
resource-uri#i1 http://www.w3.org/TR/xpath20#local-name input string

Alternatively, the predicate attribute and local-name function can override the default production of a predicate in the rules, as shown below.

<indexSpecification namespace="http://ibm/rdm/sketch">
  <secondaryResource element="//button/@id">
    <property object="./label" predicate="./local-name()" />
  </secondaryResource>
  <secondaryResource element="//input/@id">
    <property object="./label" predicate="./local-name()" />
  </secondaryResource>
</indexSpecification>

Resulting in:

Subject Predicate Object Type
resource-uri#b1 http://ibm/rdm/sketch#button First string
resource-uri#b2 http://ibm/rdm/sketch#button Second string
resource-uri#i1 http://ibm/rdm/sketch#input First string

Using Predicate Selectors

Our next example introduces XML elements where the name of a predicate is logically held in part of the XML itself. This is commonly used to attach arbitrary name/value pairs to a document. The example shows such a property in both attribute-centric and element-centric forms.

<sketch xmlns="http://ibm/rdm/sketch" >
  <window><!-- elided --></window>
  <user-property name="property1" value="value1"/>
  <!-- or -->
  <user-property name="property2">value2</user-property>
  <!-- or -->
  <user-property><name>property3</name><value>value3</value></user-property>
</sketch>

One way to accomplish this would be to treat each property as a secondary resource as we have below.

<indexSpecification namespace="http://ibm/rdm/sketch">
  <secondaryResource element="//user-property">
    <property object="./@name" />
    <property object="./@value" />
  </secondaryResource>
  <!-- or -->
  <secondaryResource element="//user-property">
    <property object="./@name" />
    <property object="." />
  </secondaryResource>
  <!-- or -->
  <secondaryResource element="//user-property">
    <property object="./name" />
    <property object="./value" />
  </secondaryResource>
</indexSpecification>

Another important aspect of the indexed properties in the table below is that, unlike the index specifications above, the secondaryResource specifies an element and not an attribute. In this case there is no identifier that can be used as the secondary resource ID, so the subject is constructed from an absolute XPath expression to the selected node.

Subject Predicate Object Type
resource-uri#/sketch/user-property[0] http://ibm/rdm/sketch#name property1 string
resource-uri#/sketch/user-property[0] http://ibm/rdm/sketch#value value1 string
resource-uri#/sketch/user-property[1] http://ibm/rdm/sketch#name property2 string
resource-uri#/sketch/user-property[1] http://ibm/rdm/sketch#value value2 string
resource-uri#/sketch/user-property[2] http://ibm/rdm/sketch#name property3 string
resource-uri#/sketch/user-property[2] http://ibm/rdm/sketch#value value3 string

However, given this index rule...

<indexSpecification namespace="http://ibm/rdm/sketch">
  <index element="//user-property">
    <property predicate="./@name" object="./@value" />
  </index>
  <!-- or -->
  <index element="//user-property">
    <property predicate="./@name" object="." />
  </index>
  <!-- or -->
  <index element="//user-property">
    <property predicate="./name" object="./value" />
  </index>
</indexSpecification>

...we get these triples.

Subject Predicate Object Type
resource-uri http://ibm/rdm/sketch#property1 value1 string
resource-uri http://ibm/rdm/sketch#property2 value2 string
resource-uri http://ibm/rdm/sketch#property3 value3 string
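
The effect of a predicate selector can be sketched in Python as follows. The function name user_property_triples is hypothetical, and for brevity the sketch folds the three alternative rules above into one routine that tries the attribute form, then the child-element form, then the text content.

```python
import xml.etree.ElementTree as ET

SKETCH_NS = "http://ibm/rdm/sketch"


def user_property_triples(resource_uri, xml_text):
    """Index user-property elements, taking the predicate name from the document."""
    root = ET.fromstring(xml_text)
    triples = []
    for prop in root.iter(f"{{{SKETCH_NS}}}user-property"):
        # Predicate: ./@name, falling back to the ./name child element.
        name = prop.get("name") or prop.findtext(f"{{{SKETCH_NS}}}name")
        # Object: ./@value, then ./value, then the element's own text (".").
        value = (prop.get("value")
                 or prop.findtext(f"{{{SKETCH_NS}}}value")
                 or (prop.text or "").strip())
        if name and value:
            # The name must be an NCName; it is qualified with the rule namespace.
            triples.append((resource_uri, f"{SKETCH_NS}#{name}", value, "string"))
    return triples
```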

Built-in XML Property Indexing Rule for Atom Namespace

The following is the built-in XML property indexing rule for the Atom namespace. The rule extracts the "src" property of atom:content elements. This may allow clients to query for entry and collection resources that have a known URI as their content.

<?xml version="1.0" encoding="UTF-8"?>
<indexSpecification xmlns="http://example.org/xmlns/openservices/v0.6"
      namespace="http://www.w3.org/2005/Atom">
   <index element="//content">
      <property object="./@src" objectType="uri" />
   </index>
</indexSpecification>

This particular indexing rule is built-in. The client MUST NOT update or delete this resource; the server responds to such attempts with 403 Forbidden.

Examples