Resource Indexing Additional Service
The indexing described in this specification allows client applications not just to find resources but also to navigate resource sets and reason about the content of, and relationships between, resources. To support this, the server needs to know the relevant aspects of a resource, how to extract that information, and how to provide a queryable index of the resource.
Motivation
The server builds two kinds of internal indexes: a property index and a full text index. Although similar in many ways, these are two separate internal indexes. This document focuses on the configuration rules for property indexes.
How the server extracts information from resources for its indexes is configurable via a collection of indexing rules. This document describes the configuration of the XML indexer. Index rules apply globally to all resources stored in the server. Within the server, properties are logically stored as triples of the form (subject, predicate, object), where the subject is the resource URI.
The Admin role is required to add, delete, or update indexing rules. Changing the rules can also trigger a long-running operation, since rebuilding the server's indexes may require retrieving and re-indexing many resources. Changes like this are expected to happen infrequently; when they do, they may entail migrating existing resources from the old format to the new one. Operations that create, update, or delete indexing rules will return 403 Forbidden if the user does not have administration permission.
Specification
The {indexing-rules} URI identifies an Indexing Rules Collection resource that serves as a directory of existing indexing rules and a factory for creating new indexing rules. A URI of the form {indexing-rules}/{rule-id} identifies an Indexing Rule resource, which describes a single indexing rule. (Note that the form of {rule-id} is unspecified; clients MUST treat the URI as opaque.)
Summary of service URIs:
- {indexing-rules} - Indexing Rules feed resource
- {indexing-rules}/{rule-id} - Indexing Rule entry resource
Indexing Rules Collection resource
The Indexing Rules Collection resource provides a directory of all known Indexing Rule resources, a factory for creating new Indexing Rule resources, and a means for performing administrative operations like rebuilding the internal indexes.
Method | URL | Comments |
---|---|---|
GET/HEAD | {indexing-rules} | GET returns an atom:feed document with an atom:entry for each indexing rule known to the server (requires special server admin privileges - 403 Forbidden if not authorized). The "src" attribute of the atom:content element is the URI of the Indexing Rule resource. The atom:title for the entry describes the rule. The response MUST include accurate ETag and Last-Modified headers. |
POST | {indexing-rules} | POST is used to create a new indexing rule (requires special server admin privileges - 403 Forbidden if not authorized). The Content-Type MUST be "application/xml" and the representation MUST be a jazz:indexSpecification element (400 Bad Request if it is not). The representation of an Indexing Rule resource is detailed below. The server MUST fail the operation (403 Forbidden) if the new indexing rule would conflict with the existing rules. The response MUST include a Location header with the URI of the newly created Indexing Rule resource; the response body MUST be an accurate representation of the Indexing Rule resource and be accompanied by accurate ETag and Last-Modified headers. |
POST | {indexing-rules}?reindex | POST is also used to trigger server index maintenance. POSTing to the URI /jazz/indexing-rules?reindex with an empty body requests that the server rebuild its internal indexes (requires special server admin privileges - 403 Forbidden if not authorized). Rebuilding the internal indexes can be a lengthy operation. On receipt of the reindex POST the server will return 202 Accepted with a Location response header indicating a resource that the client can GET to discover the progress of the long-running operation. This makes the long-running reindex operation entirely asynchronous. This operation may not be executed more than once concurrently; an attempt to start the operation while it is already running will return 400 Bad Request. |
PUT | {indexing-rules} | PUT is not permitted (405 Method Not Allowed). The Indexing Rules Collection resource has no significant atom:feed header information that would require updating. |
DELETE | {indexing-rules} | DELETE is not permitted (405 Method Not Allowed). |
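As a rough illustration of the create operation in the table above, the following Python sketch builds (without sending) the POST request for a new indexing rule. The server URL and the helper name `build_create_rule_request` are assumptions made for this example; only the method, the Content-Type, and the kind of payload come from the table.

```python
import urllib.request

# Hypothetical server location; substitute the real {indexing-rules} URI.
INDEXING_RULES_URL = "https://server.example/jazz/indexing-rules"

# A minimal rule document, modelled on the examples in this specification.
RULE_XML = """<indexSpecification
    xmlns="http://example.org/xmlns/openservices/v0.6"
    namespace="http://music.example.org/schema">
  <index element="//title" />
</indexSpecification>"""


def build_create_rule_request(collection_url, rule_xml):
    """Build, but do not send, the POST that creates a new indexing rule."""
    return urllib.request.Request(
        collection_url,
        data=rule_xml.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


request = build_create_rule_request(INDEXING_RULES_URL, RULE_XML)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; a 403 or 400 response surfaces there as an `urllib.error.HTTPError`, and on success the Location response header should carry the URI of the new rule.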
- GET returns an idx:indexSpecification document describing the particular indexing rule. The response MUST include accurate ETag and Last-Modified headers. The server MUST return 404 Not Found if the client has the necessary permissions but there is no such Indexing Rule resource.
- HEAD is supported with same rules as GET.
- PUT is used to update an existing indexing rule (requires the Admin role - otherwise 403 Forbidden). The Content-Type MUST be "application/xml" and the representation MUST be a valid idx:indexSpecification document (400 Bad Request if it is not). The server MUST fail the operation (403 Forbidden) if the modified indexing rule would conflict with any of the other indexing rules. When an indexing rule is updated, the server's internal indexes become stale and remain that way until they are rebuilt. The request MUST include non-trivial If-Match and If-Unmodified-Since headers (400 Bad Request if the headers are not included; 409 Conflict if the resource exists but the preconditions fail; 412 Precondition Failed if the resource does not exist).
- POST is not permitted (405 Method not allowed).
- DELETE is used to delete an indexing rule (requires the Admin role - otherwise 403 Forbidden). When an indexing rule is deleted, the server's internal indexes become stale and remain that way until they are rebuilt. The request MUST include non-trivial If-Match and If-Unmodified-Since headers (400 Bad Request if the headers are not included; 409 Conflict if the resource exists but the preconditions fail; 412 Precondition Failed if the resource does not exist).
- Indexing Rule resources do not track revisions.
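The precondition requirement on PUT and DELETE can be sketched as follows. This is a minimal Python illustration, not a definitive client: the helper name `build_conditional_request` and the rule URL are hypothetical, and the ETag and Last-Modified values are assumed to have been copied from an earlier GET of the same rule.

```python
import urllib.request


def build_conditional_request(method, rule_url, etag, last_modified, rule_xml=None):
    """Build a PUT or DELETE carrying both required precondition headers.

    etag and last_modified are the ETag and Last-Modified header values
    returned by an earlier GET of the same Indexing Rule resource.
    """
    if method not in ("PUT", "DELETE"):
        raise ValueError("only PUT and DELETE take preconditions here")
    headers = {"If-Match": etag, "If-Unmodified-Since": last_modified}
    data = None
    if method == "PUT":
        if rule_xml is None:
            raise ValueError("PUT needs an indexSpecification document")
        headers["Content-Type"] = "application/xml"
        data = rule_xml.encode("utf-8")
    return urllib.request.Request(rule_url, data=data, headers=headers, method=method)


# Hypothetical rule URI and validator values, for illustration only.
delete_request = build_conditional_request(
    "DELETE",
    "https://server.example/jazz/indexing-rules/r1",
    '"abc"',
    "Thu, 10 Jan 2008 16:37:41 GMT",
)
print(delete_request.get_method(), dict(delete_request.header_items()))
```

Echoing the validators exactly as received keeps the client from having to parse or reformat HTTP dates itself.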
Indexing Rule resources
An Indexing Rule resource describes a particular indexing rule. An XML property indexing rule is represented by an idx:indexSpecification element.
Indexing Rule documents are in the following XML namespace: http://example.org/xmlns/openservices/v0.6
<?xml version="1.0"?>
<indexSpecification xmlns="http://example.org/xmlns/openservices/v0.6"
    namespace="http://music.example.org/schema"
    onlyForType="application/x-com.ibm.examples.music+xml">
  <index element="//title" />
  <index element="//genre" />
  <index element="//release-date">
    <property object="." objectType="date" />
  </index>
  <index element="//cover-art">
    <property object="./@href" objectType="uri" />
  </index>
</indexSpecification>
Method | URL | Comments |
---|---|---|
GET/HEAD | {indexing-rules}/{rule-id} | Requires Reader role, otherwise 403 (Forbidden). |
POST | {indexing-rules}/{rule-id} | Not permitted - returns 405 (Method Not Allowed). |
PUT | {indexing-rules}/{rule-id} | Requires Admin role, otherwise 403 (Forbidden). |
DELETE | {indexing-rules}/{rule-id} | Requires Admin role, otherwise 403 (Forbidden). |
XML Representation Specification
XML Representation Summary: indexSpecificationElement | |
<indexSpecification |
|
Property | Representation |
---|---|
{namespace} | The required XML namespace URI. It is used to match XPath expressions for "element" and "secondaryResource" in each index element. |
{onlyForType} | The optional content type match. This set of rules will only be evaluated against resources that are stored with the specified content type. |
{index} | The required set of indexing rules. |
The "namespace" attribute provides the URI of an XML namespace. There MUST NOT be more than one XML property indexing rule for any given XML namespace. The server MUST detect such conflicts when indexing rules are being created or updated, and fail the operation (403 Forbidden) if there are conflicts.
The "onlyForType" attribute provides a MIME content type that identifies the specific type of resources to which this index applies. This optimization ensures that type-specific rules are not applied to all resources in general.
XML Representation Summary: indexElement |
|
<index |
|
Property | Representation |
---|---|
{element} | The required "XPath lite" expression to match an XML element in the target document. |
{property} | The optional set of property indexing rules to apply relative to the element matched by the {element} attribute. |
Each idx:index describes how to extract a set of triples from the representation of the matched resource. The "element" attribute specifies an "XPath lite" expression to match for the index rule to be applied.
Notes:
- If an idx:index element is specified with no idx:property elements, the predicate is the matching element's tag name.
  - If the target of the "element" XPath expression is an element, the object is the text content of the element and the object type is string.
  - If the target of the "element" XPath expression is an attribute, the object is the value of the attribute and the object type is string.
- If an idx:index element is specified with one or more idx:property elements, then all index triples will be extracted using the rules specified by the contained idx:property elements only.
- If an idx:index element specifies more than one idx:property and more than one of those elements resolves to an index triple, then an RDF blank node corresponding to the idx:index element will be created to contain all of the resolved index triples. The result is a compound-valued property for the idx:index rule.
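The simplest case in the notes above, an idx:index with no idx:property children that targets an element, can be approximated in a few lines of Python. This is only a sketch of the described behaviour, not the server's implementation: the function name is hypothetical, and the "namespace#tag" predicate form is taken from the example tables elsewhere in this document.

```python
import xml.etree.ElementTree as ET

MUSIC_NS = "http://music.example.org/schema"

TRACK = """<track xmlns="http://music.example.org/schema">
  <title>Do you know the way to San Jose</title>
  <genre>pop</genre>
  <genre>rock</genre>
</track>"""


def index_element_text(xml_text, resource_uri, namespace, local_name):
    """Emulate <index element="//local_name"/> with no idx:property children."""
    root = ET.fromstring(xml_text)
    predicate = f"{namespace}#{local_name}"
    # iter() visits the root and all descendants, like a leading "//".
    return [
        (resource_uri, predicate, (node.text or "").strip(), "string")
        for node in root.iter(f"{{{namespace}}}{local_name}")
    ]


for triple in index_element_text(TRACK, "resource-uri", MUSIC_NS, "genre"):
    print(triple)
```

Each matching element yields one triple, so a repeated element such as genre produces a multi-valued property.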
XML Representation Summary: secondaryResourceElement |
|
<secondaryResource |
|
Property | Representation |
---|---|
{element} | The required "XPath lite" expression to match an XML element or attribute in the target document. |
{property} | The optional set of property indexing rules to apply relative to the element matched by the {element} attribute. |
{index} | The optional set of nested indexing rules to apply relative to the element matched by the {element} attribute. |
Each idx:secondaryResource describes how to extract a set of triples from the representation of the matched resource. The extracted properties will be considered part of the secondary resource identified by the "element" expression. The "element" attribute specifies an "XPath lite" expression to match for the index rule to be applied.
Notes:
- The "element" attribute may specify either an element or an attribute to match. The subject of the resulting index is determined by the type of node that is matched.
  - If an attribute is specified, it is assumed to be an ID and the subject URI becomes the resource-uri + "#" + ID.
  - If an element is specified, the subject URI becomes the resource-uri + "#" + XPath path to the element.
- If an idx:secondaryResource element is specified with no idx:property or idx:index elements, the predicate is the matching element's tag name.
  - If the target of the "element" XPath expression is an element, the object is the text content of the element and the object type is string.
  - If the target of the "element" XPath expression is an attribute, the object is the value of the attribute and the object type is string.
- If an idx:secondaryResource element is specified with one or more idx:property or idx:index elements, then all index triples will be extracted using the rules specified by those contained elements only.
- All index triples resolved by idx:property elements will be represented as simple triples belonging to the secondary resource.
- Compound-value properties within a secondary resource can be indexed using nested idx:index rules that specify multiple idx:property elements.
XML Representation Summary: propertyElement |
|
<property |
|
Property | Representation |
---|---|
{predicate} | The optional "XPath lite" expression used to denote the node (element or attribute) which contains the predicate name (may also specify the XPath function local-name()). |
{object} | The required "XPath lite" expression used to denote the node (element or attribute) which contains the object value (may also specify the XPath function local-name()). |
{objectType} | The optional datatype of the object, one of string, int, boolean, date, uri. |
Each idx:property describes how to extract a single triple from the representation of the matched resource; its attributes specify the predicate, object and object type.
Notes:
- The predicate XPath expression identifies a node that contains the value to be used as the statement predicate. Any value must be an XML NCName and will be made into a URL using the namespace for the indexer.
- The object XPath expression identifies both the node which contains the value to be used as the statement object/value and the default predicate (if not explicitly specified), using the following rules:
  - If the object identifies an attribute or element, then the default predicate is the qualified form "namespace#node-name".
  - If the object contains the XPath function local-name(), then the value is taken to be the unqualified name of the node selected by secondaryResource or element, and the default predicate becomes "http://www.w3.org/TR/xpath20#local-name".
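The default-predicate rules above can be expressed as a small function. This is a sketch under stated assumptions: the function name and the `context_name` parameter (the local name of the node matched by the enclosing rule, used when the object expression is just ".") are inventions for this example, and only the simple path shapes used in this document are handled.

```python
LOCAL_NAME_PREDICATE = "http://www.w3.org/TR/xpath20#local-name"


def default_predicate(namespace, object_expr, context_name):
    """Derive the predicate used when a property rule names no predicate."""
    if object_expr.endswith("local-name()"):
        return LOCAL_NAME_PREDICATE
    last_step = object_expr.rstrip("/").split("/")[-1]
    if last_step in (".", ""):
        # The object is the matched node itself, so use its own name.
        node_name = context_name
    else:
        # Drop the attribute marker: "@href" becomes "href".
        node_name = last_step.lstrip("@")
    return f"{namespace}#{node_name}"


print(default_predicate("http://ibm/rdm/glossary", "./@name", "term"))
print(default_predicate("http://music.example.org/schema", ".", "release-date"))
print(default_predicate("http://ibm/rdm/sketch", "./local-name()", "button"))
```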
Normalization of indexed URI properties
The indexer MUST normalize indexed URI properties so that queries for references to a particular resource can be resolved. For example, a query of the form:
/jazz/services/query?http://www.example.org/music#cover-art=/jazz/resources/myProject/covers/Hendrix1.png
should be resolvable even if the link was specified as relative in the referencing document, as in <a href="../covers/Hendrix1.png"/>.
The following URI expansion rules are applied to normalize indexed URI properties:
- The protocol, host, and port are stripped off of URIs that refer to the host server
- Relative path references are expanded to absolute path references
- Relative paths are expanded with respect to the containing resource's URI if no base is specified by the resource. Otherwise, relative paths are expanded with respect to the resource's specified base.
XML documents may specify a base against which all relative links are expanded, using the standard xml:base mechanism (http://www.w3.org/TR/xmlbase/). RFC 2396 [IETF RFC 2396] provides for base URI information to be embedded within a document. The rules for determining the base URI can be summarized as follows (highest priority to lowest):
- The base URI is embedded in the document's content.
- The base URI is that of the encapsulating entity (message, document, or none).
- The base URI is the URI used to retrieve the entity.
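The expansion rules above map closely onto standard URI resolution. The following Python sketch shows one way to apply them with `urllib.parse`; the function name, the `server_origin` parameter, and the example host are assumptions for illustration, and a real indexer may differ in detail.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def normalize_uri(link, resource_uri, server_origin, xml_base=None):
    """Expand a link found in a resource and strip the host server's origin."""
    # An xml:base, if present, takes priority over the retrieval URI.
    base = urljoin(resource_uri, xml_base) if xml_base else resource_uri
    absolute = urljoin(base, link)
    parts = urlsplit(absolute)
    origin = urlsplit(server_origin)
    if (parts.scheme, parts.netloc) == (origin.scheme, origin.netloc):
        # Same server: keep only the absolute path, query, and fragment.
        return urlunsplit(("", "", parts.path, parts.query, parts.fragment))
    return absolute


RESOURCE = "https://server.example/jazz/resources/myProject/albums/a1.xml"
print(normalize_uri("../covers/Hendrix1.png", RESOURCE, "https://server.example"))
```

With these inputs the relative link resolves to /jazz/resources/myProject/covers/Hendrix1.png, which is the form used in the query example above; links to other hosts are left as absolute URIs.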
Multi-namespace documents
It is common, and indeed recommended, practice to design XML resources as compound documents mixing in elements and attributes from different namespaces. The XML indexer supports this practice: it attempts to apply all indexing rules that match the namespaces used by an instance document. The indexer uses strong namespace matching rules to find the "element" for idx:index and idx:secondaryResource rules, but is namespace-neutral in evaluating the XPath for the "predicate" and "object" in idx:property rules. This implies that the node selected by a predicate or object expression may not be in the namespace of the element. Note that namespace-neutral means the indexer will not match on the node's namespace; however, any node to be indexed must have a valid namespace, and non-namespaced nodes are not matched.
XML Form | Comments |
---|---|
... <glossary:term glossary:id="t1" glossary:name="term 1"/> ... | In this example the glossary term and the attributes that are expected to be indexed are clearly in the same namespace, so this rule is straightforward. |
... <glossary:term id="t1" name="term 1"/> ... | In this example it is not clear whether the id and name attributes are part of the glossary namespace or not (it will depend on the attributeFormDefault value in the schema). This is a common situation and the indexer will handle it by treating attributes in a namespace-neutral manner. However, if the XPath processor determines that id and name have no associated namespace, the indexer will fail to match them. |
... <glossary:term base:id="t1" base:name="term 1"/> ... | Due to the introduction of namespace-neutral attribute handling, this case works. The indexer will allow the definition of an index rule that matches the glossary term (as either an element or secondary resource) and then select id and name as index properties of the term. |
... <glossary:term> <glossary:id>t1</glossary:id> <glossary:name>term 1</glossary:name> </glossary:term> ... | In this case the term and the child elements are once more all in the same namespace. This is a straightforward indexing rule and is demonstrated in the examples below. |
... <glossary:term> <base:id>t1</base:id> <base:name>term 1</base:name> </glossary:term> ... | This rule also works, as the indexer only specifies the matching namespace for the element and not the predicate and object. |
For example, consider the following index rule definition.
<idx:indexSpecification xmlns:idx="http://example.org/xmlns/openservices/v0.6"
    namespace="http://ibm/rdm/sketch-link">
  <idx:secondaryResource element="//button/@id">
    <idx:property object="./label" objectType="string"/>
    <idx:property object="./local-name()" objectType="string"/>
    <idx:property predicate=".//link/@rel" object=".//link/@href" objectType="uri"/>
  </idx:secondaryResource>
  <idx:secondaryResource element="//input/@id">
    <idx:property object="./label" objectType="string"/>
    <idx:property object="./local-name()" objectType="string"/>
  </idx:secondaryResource>
</idx:indexSpecification>
Then, given the following resource, we can see that the link elements should all be indexed regardless of namespace. However, the fourth link on the first button (rel="bogus") will not be indexed, because its rel and href attributes do not specify any namespace at all.
<sketch xmlns="http://ibm/rdm/sketch-link"
    xmlns:base="http://ibm/rdm/base"
    xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <window>
    <button id="b1">
      <label>First</label>
      <base:link base:rel="self" base:href="/jazz/resources/p/test1" />
      <base:link base:rel="other" base:href="/jazz/resources/p/test2" />
      <xhtml:link xhtml:rel="stylesheet" type="text/css" xhtml:href="/jazz/resources/p/stylesheet" />
      <link rel="bogus" href="urn:no-namespace-here" />
    </button>
  </window>
  <dialog>
    <frame>
      <panel>
        <button id="b2">
          <label>Second</label>
          <base:link base:rel="self" base:href="/jazz/resources/p/test3" />
        </button>
        <input id="i1">
          <label>First</label>
        </input>
      </panel>
    </frame>
  </dialog>
</sketch>
These then are the resulting index properties.
Subject | Predicate | Object | Type |
---|---|---|---|
project/structured-sketch-2.xml | ##rootElement | http://ibm/rdm/sketch#sketch | string |
project/structured-sketch-2.xml#b1 | http://ibm/rdm/sketch#label | First | string |
project/structured-sketch-2.xml#b1 | http://ibm/rdm/sketch#button | button | string |
project/structured-sketch-2.xml#b1 | http://ibm/rdm/base#self | /jazz/resources/p/test1 | uri |
project/structured-sketch-2.xml#b1 | http://ibm/rdm/base#other | /jazz/resources/p/test2 | uri |
project/structured-sketch-2.xml#b1 | http://www.w3.org/1999/xhtml#stylesheet | /jazz/resources/p/stylesheet | uri |
project/structured-sketch-2.xml#b2 | http://ibm/rdm/sketch#label | Second | string |
project/structured-sketch-2.xml#b2 | http://ibm/rdm/sketch#button | button | string |
project/structured-sketch-2.xml#b2 | http://ibm/rdm/base#self | /jazz/resources/p/test3 | uri |
project/structured-sketch-2.xml#i1 | http://ibm/rdm/sketch#label | First | string |
project/structured-sketch-2.xml#i1 | http://ibm/rdm/sketch#input | input | string |
Re-indexing
To ensure that resources are up-to-date with respect to index specifications an application can either:
- Ensure all index specifications are added prior to the addition of resources, or
- When index specifications need to be added or changed run a re-index operation.
The re-index operation scans all resources and re-applies all relevant indexers and index rules to them to see if the stored index properties need to be changed. This is a long-running operation that can have a significant performance impact on the server, and therefore should not be run frequently. To invoke the operation, the client application performs a POST with the single query parameter ?reindex. The server will respond with a 202 (Accepted) status code and a Location response header that identifies a resource that will hold the status of the operation. The general behavior is therefore:
- Client performs POST to /jazz/indexing-rules?reindex.
- Server responds with a status code of 202 and a Location response header value of /jazz/indexing-rules/progress/20080110T113741.762-0500.
- The client can then GET this progress resource, noting the value of the /operation/status element, which will be "running" while executing and then "completed" when done.
- Once done, the /operation/errors/error* elements will contain any reported errors; if there are none, the client can assume the operation completed successfully.
- It is the client application's responsibility to delete the progress resource when it has no further need for it.
The following is an example of the re-index progress resource.
<?xml version="1.0" encoding="utf-8"?>
<operation xmlns="http://example.org/xmlns/openservices/v0.6">
  <name>reindexing</name>
  <status>completed</status>
  <count>32</count>
  <errors/>
</operation>
The re-index operation may not be run more than once at the same time. If the re-index operation is already running, any subsequent call to re-index will receive a 400 (Bad Request) status code.
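The polling side of the re-index flow described above can be sketched in Python as follows. The function names and the injected `fetch_progress` callable are hypothetical; the element names come from the progress resource example, and the child element name `error` is inferred from the /operation/errors/error* path mentioned in the steps.

```python
import time
import xml.etree.ElementTree as ET

OPS_NS = {"o": "http://example.org/xmlns/openservices/v0.6"}

PROGRESS = """<operation xmlns="http://example.org/xmlns/openservices/v0.6">
  <name>reindexing</name>
  <status>completed</status>
  <count>32</count>
  <errors/>
</operation>"""


def parse_progress(xml_text):
    """Return (status, errors) from a re-index progress document."""
    root = ET.fromstring(xml_text)
    status = root.findtext("o:status", default="", namespaces=OPS_NS)
    errors = [e.text or "" for e in root.findall("o:errors/o:error", OPS_NS)]
    return status, errors


def wait_for_reindex(fetch_progress, poll_seconds=5.0, sleep=time.sleep):
    """Poll until the status is no longer "running".

    fetch_progress is any callable that GETs the progress resource named
    by the Location header and returns its body as text.
    """
    while True:
        status, errors = parse_progress(fetch_progress())
        if status != "running":
            return status, errors
        sleep(poll_seconds)


print(parse_progress(PROGRESS))
```

Injecting the fetch and sleep functions keeps the loop testable without a server; a caller would typically also DELETE the progress resource once the result has been read.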
XPath Subset used in index rules
The following describes the subset of XPath currently accepted by the indexer. Note that the XPath used in "element" attributes MUST be an absolute expression; that is, it must start with "/" or "//". In contrast, the object and predicate expressions MUST be relative expressions; that is, they must start with ".", "./" or ".//". Also, predicates may specify the "literal()" function.
- /foo - matches the element "foo" if, and only if, it is the root element.
- //foo - matches the element "foo" anywhere it appears in the document.
  - Deprecated: foo - shortcut for //foo
- /@foo - matches the attribute "foo" if, and only if, it is on the root element.
- //@foo - matches the attribute "foo" anywhere it appears in the document.
  - Deprecated: @foo - shortcut for //@foo
- /foo/bar - matches the element "bar" if, and only if, it is the child of a root element named "foo".
- //foo/bar - matches the element "bar" if it is the child of an element named "foo" that appears anywhere in the document.
- //foo/bar/moo - and so on...
- ./foo - matches the element "foo" if, and only if, it is a child of the "current" node - used in specifying predicates/objects for secondary resources.
- .//foo - matches the element "foo" anywhere it appears as a child of the "current" node - used in specifying predicates/objects for secondary resources.
- ./@foo - matches the attribute "foo" if, and only if, it is a child of the "current" node - used in specifying predicates/objects for secondary resources.
- .//@foo - matches the attribute "foo" anywhere it appears as a child of the "current" node - used in specifying predicates/objects for secondary resources.
- /local-name() or ./local-name() or /foo/local-name() or /@foo/local-name() or ./foo/local-name() or ./@foo/local-name() - returns the name of the selected attribute or element node rather than its content.
- literal(String) - allows a property rule to specify a string literal for the predicate.
Note that each expression above is in fact namespace-aware, so an example such as //foo/bar logically maps to an XPath expression of the form:
declare pre = namespace return //pre:foo/pre:bar
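The namespace binding described above can be demonstrated with Python's `xml.etree.ElementTree`, which supports a limited XPath subset with prefix maps. This is an illustrative sketch only: the function name is hypothetical, it handles only element paths that start with "//", and the sample namespaces are made up.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<foo xmlns="urn:example:a"><bar>in-a</bar>'
    '<foo xmlns="urn:example:b"><bar>in-b</bar></foo></foo>'
)


def find_qualified(xml_text, namespace, path):
    """Evaluate a simple '//a/b' path with every step bound to one namespace."""
    if not path.startswith("//"):
        raise ValueError("this sketch only handles paths starting with '//'")
    steps = "/".join(f"pre:{step}" for step in path[2:].split("/"))
    # ElementTree's './/' skips the context node itself, so search from a
    # wrapper element to let the document root match as well.
    wrapper = ET.Element("wrapper")
    wrapper.append(ET.fromstring(xml_text))
    return wrapper.findall(f".//{steps}", {"pre": namespace})


print([node.text for node in find_qualified(DOC, "urn:example:a", "//foo/bar")])
```

The same path selects different nodes depending on the namespace bound to the prefix, which is why a rule set is tied to exactly one namespace.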
Examples
A Basic Example
The following XML property indexing rule is for XML namespace identified as "http://music.example.org/schema".
<?xml version="1.0" encoding="UTF-8"?>
<indexSpecification xmlns="http://example.org/xmlns/openservices/v0.6"
    namespace="http://music.example.org/schema"
    onlyForType="application/x-com.ibm.examples.music+xml">
  <index element="//title" />
  <index element="//genre" />
  <index element="//release-date">
    <property object="." objectType="date" />
  </index>
  <index element="//cover-art">
    <property object="./@href" predicate="./local-name()" objectType="uri" />
  </index>
</indexSpecification>
When applied to a resource whose content type is application/x-com.ibm.examples.music+xml and whose XML-based representation is:
<?xml version="1.0" ?>
<track xmlns="http://music.example.org/schema">
  <title>Do you know the way to San Jose</title>
  <genre>pop</genre>
  <genre>rock</genre>
  <language>english</language>
  <release-date>1971-04-30T00:00:01Z</release-date>
  <cover-art href="http://music.example.org/cat-1248627636" />
</track>
The indexer will extract the following set of triples:
Subject | Predicate | Object | Type |
---|---|---|---|
resource-uri | http://music.example.org/schema#title | Do you know the way to San Jose | string |
resource-uri | http://music.example.org/schema#genre | pop | string |
resource-uri | http://music.example.org/schema#genre | rock | string |
resource-uri | http://music.example.org/schema#release-date | 1971-04-30T00:00:01Z | date |
resource-uri | http://music.example.org/schema#cover-art | http://music.example.org/cat-1248627636 | uri |
A Secondary Resource Example
The following is a common form of glossary document: the glossary itself has a name that should be indexed, and it contains a set of terms that should each be indexed as a uniquely identifiable secondary resource. Below is an example of such a glossary resource.
<Glossary xmlns="http://ibm/rdm/glossary" name="glossary1">
  <term id="t1" name="term1" status="published" definition="term1 defined"/>
  <term id="t2" name="term2" status="published" definition="term2 defined"/>
  <term id="t3" name="term3" status="published" definition="term3 defined"/>
</Glossary>
The following is an indexer specification that provides the correct set of index properties (triples) shown in the next table.
<indexSpecification namespace="http://ibm/rdm/glossary">
  <index element="/Glossary">
    <property object="./@name" />
  </index>
  <secondaryResource element="//term@id">
    <property object="./@name" />
    <property object="./@status" />
    <property object="./@definition" />
  </secondaryResource>
</indexSpecification>
Notes:
- To index the name of the glossary resource we use the element selector and then extract the object from the name attribute of the selected element.
- To index the properties of each term we use the secondaryResource selector, but note that we actually identify the ID attribute of the term element which is then used as the subject ID.
- Each term property is extracted with an attribute selector. Note that the XPath context (.) is assumed to be the element selected as //term.
The indexer will extract the following set of triples:
Subject | Predicate | Object | Type |
---|---|---|---|
resource-uri | http://ibm/rdm/glossary#name | glossary1 | string |
resource-uri#t1 | http://ibm/rdm/glossary#name | term1 | string |
resource-uri#t1 | http://ibm/rdm/glossary#status | published | string |
resource-uri#t1 | http://ibm/rdm/glossary#definition | term1 defined | string |
resource-uri#t2 | http://ibm/rdm/glossary#name | term2 | string |
resource-uri#t2 | http://ibm/rdm/glossary#status | published | string |
resource-uri#t2 | http://ibm/rdm/glossary#definition | term2 defined | string |
resource-uri#t3 | http://ibm/rdm/glossary#name | term3 | string |
resource-uri#t3 | http://ibm/rdm/glossary#status | published | string |
resource-uri#t3 | http://ibm/rdm/glossary#definition | term3 defined | string |
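To make the mapping from the glossary document to the table above concrete, here is a small Python sketch that reproduces those triples directly. It hard-codes what the index specification expresses declaratively, so it is an illustration of the expected output rather than an indexer; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

GLOSSARY_NS = "http://ibm/rdm/glossary"

GLOSSARY = """<Glossary xmlns="http://ibm/rdm/glossary" name="glossary1">
  <term id="t1" name="term1" status="published" definition="term1 defined"/>
  <term id="t2" name="term2" status="published" definition="term2 defined"/>
  <term id="t3" name="term3" status="published" definition="term3 defined"/>
</Glossary>"""


def glossary_triples(xml_text, resource_uri):
    """Emit the primary-resource triple, then one group per term."""
    root = ET.fromstring(xml_text)
    triples = [(resource_uri, f"{GLOSSARY_NS}#name", root.get("name"), "string")]
    for term in root.iter(f"{{{GLOSSARY_NS}}}term"):
        # An attribute match is treated as an ID: subject = resource-uri#ID.
        subject = f"{resource_uri}#{term.get('id')}"
        for attr in ("name", "status", "definition"):
            triples.append((subject, f"{GLOSSARY_NS}#{attr}", term.get(attr), "string"))
    return triples


for triple in glossary_triples(GLOSSARY, "resource-uri"):
    print(triple)
```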
To construct the very same set of triples but from an element-centric form of the very same document we need to change the indexing rules only a little. Here's an example document.
<Glossary xmlns="http://ibm/rdm/glossary">
  <name>glossary1</name>
  <term id="t1">
    <name>term1</name>
    <status>published</status>
    <definition>term1 defined</definition>
  </term>
  <term id="t2">
    <name>term2</name>
    <status>published</status>
    <definition>term2 defined</definition>
  </term>
  <term id="t3">
    <name>term3</name>
    <status>published</status>
    <definition>term3 defined</definition>
  </term>
</Glossary>
And here is the corresponding index specification. Note that the only real difference is the removal of the "@" character in each object selector.
<indexSpecification namespace="http://ibm/rdm/glossary">
  <index element="/Glossary">
    <property object="./name" />
  </index>
  <secondaryResource element="//term@id">
    <property object="./name" />
    <property object="./status" />
    <property object="./definition" />
  </secondaryResource>
</indexSpecification>
Distinguishing Common Properties
In the following example, we have two particular needs:
- We need to be able to index and refer to each labeled widget. The button, input, and menu elements all have label properties and we would like to be able to run a query to find all things labeled "First" and get the URL for the discrete secondary resources.
- We need to be able to scope queries such that we can ask for things labeled "First" but only those that are buttons.
Given the example sketch below we need to be able to run the two queries above in some way.
<sketch xmlns="http://ibm/rdm/sketch">
  <window>
    <button id="b1">
      <label>First</label>
    </button>
  </window>
  <dialog>
    <frame>
      <panel>
        <button id="b2">
          <label>Second</label>
        </button>
        <input id="i1">
          <label>First</label>
        </input>
      </panel>
    </frame>
  </dialog>
</sketch>
Using the local-name() function we can index the local name of the selected element along with its value.
<indexSpecification namespace="http://ibm/rdm/sketch">
  <secondaryResource element="//button@id">
    <property object="./label" />
    <property object="./local-name()" />
  </secondaryResource>
  <secondaryResource element="//input@id">
    <property object="./label" />
    <property object="./local-name()" />
  </secondaryResource>
</indexSpecification>
The resulting triples shown below satisfy both the first and second queries.
Subject | Predicate | Object | Type |
---|---|---|---|
resource-uri#b1 | http://ibm/rdm/sketch#label | First | string |
resource-uri#b1 | http://www.w3.org/TR/xpath20#local-name | button | string |
resource-uri#b2 | http://ibm/rdm/sketch#label | Second | string |
resource-uri#b2 | http://www.w3.org/TR/xpath20#local-name | button | string |
resource-uri#i1 | http://ibm/rdm/sketch#label | First | string |
resource-uri#i1 | http://www.w3.org/TR/xpath20#local-name | input | string |
Alternatively, the predicate attribute and local-name function can override the default production of a predicate in the rules, as shown below.
<indexSpecification namespace="http://ibm/rdm/sketch">
  <secondaryResource element="//button@id">
    <property object="./label" predicate="./local-name()" />
  </secondaryResource>
  <secondaryResource element="//input@id">
    <property object="./label" predicate="./local-name()" />
  </secondaryResource>
</indexSpecification>
Resulting in:
Subject | Predicate | Object | Type |
---|---|---|---|
resource-uri#b1 | http://ibm/rdm/sketch#button | First | string |
resource-uri#b2 | http://ibm/rdm/sketch#button | Second | string |
resource-uri#i1 | http://ibm/rdm/sketch#input | First | string |
Using Predicate Selectors
Our next example introduces XML elements where the name of a predicate is logically held in part of the XML itself; this is commonly used to attach arbitrary name/value pairs to a document. The example shows such a property in both attribute-centric and element-centric forms.
<sketch xmlns="http://ibm/rdm/sketch">
  <window><!-- elided --></window>
  <user-property name="property1" value="value1"/>
  <!-- or -->
  <user-property name="property2">value2</user-property>
  <!-- or -->
  <user-property><name>property3</name><value>value3</value></user-property>
</sketch>
One way to accomplish this would be to treat each property as a secondary resource as we have below.
<indexSpecification namespace="http://ibm/rdm/sketch">
  <secondaryResource element="//user-property">
    <property object="./@name" />
    <property object="./@value" />
  </secondaryResource>
  <!-- or -->
  <secondaryResource element="//user-property">
    <property object="./@name" />
    <property object="." />
  </secondaryResource>
  <!-- or -->
  <secondaryResource element="//user-property">
    <property object="./name" />
    <property object="./value" />
  </secondaryResource>
</indexSpecification>
Another important aspect of the indexed properties in the table below is that, unlike the index specifications above, the secondaryResource specifies an element and not an attribute. In this case there is no identifier that can be assumed for the secondary resource ID, so we construct an absolute XPath expression to the particular selected node.
Subject | Predicate | Object | Type |
---|---|---|---|
resource-uri#/sketch/user-property[0] | http://ibm/rdm/sketch#name | property1 | string |
resource-uri#/sketch/user-property[0] | http://ibm/rdm/sketch#value | value1 | string |
resource-uri#/sketch/user-property[1] | http://ibm/rdm/sketch#name | property2 | string |
resource-uri#/sketch/user-property[1] | http://ibm/rdm/sketch#value | value2 | string |
resource-uri#/sketch/user-property[2] | http://ibm/rdm/sketch#name | property3 | string |
resource-uri#/sketch/user-property[2] | http://ibm/rdm/sketch#value | value3 | string |
However, given this index rule...
<indexSpecification namespace="http://ibm/rdm/sketch">
  <index element="//user-property">
    <property predicate="./@name" object="./@value" />
  </index>
  <!-- or -->
  <index element="//user-property">
    <property predicate="./@name" object="." />
  </index>
  <!-- or -->
  <index element="//user-property">
    <property predicate="./name" object="./value" />
  </index>
</indexSpecification>
...we get these triples.
Subject | Predicate | Object | Type |
---|---|---|---|
resource-uri | http://ibm/rdm/sketch#property1 | value1 | string |
resource-uri | http://ibm/rdm/sketch#property2 | value2 | string |
resource-uri | http://ibm/rdm/sketch#property3 | value3 | string |
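The effect of a predicate selector, where the predicate name is read from the document itself, can be sketched in Python as below. The function and helper names are hypothetical, and the code simply tries the three user-property shapes from the example in turn rather than evaluating the rule's XPath expressions.

```python
import xml.etree.ElementTree as ET

SKETCH_NS = "http://ibm/rdm/sketch"

SKETCH = """<sketch xmlns="http://ibm/rdm/sketch">
  <user-property name="property1" value="value1"/>
  <user-property name="property2">value2</user-property>
  <user-property><name>property3</name><value>value3</value></user-property>
</sketch>"""


def qualified(local_name):
    """Return the ElementTree form of a name in the sketch namespace."""
    return f"{{{SKETCH_NS}}}{local_name}"


def user_property_triples(xml_text, resource_uri):
    """Turn each user-property into one triple named by the property itself."""
    triples = []
    for prop in ET.fromstring(xml_text).iter(qualified("user-property")):
        # Name and value may be attributes, child elements, or element text.
        name = prop.get("name") or prop.findtext(qualified("name"))
        value = (
            prop.get("value")
            or prop.findtext(qualified("value"))
            or (prop.text or "").strip()
        )
        if name and value:
            triples.append((resource_uri, f"{SKETCH_NS}#{name}", value, "string"))
    return triples


for triple in user_property_triples(SKETCH, "resource-uri"):
    print(triple)
```

Because the subject stays the primary resource, a client can query directly for a property by name instead of first locating a secondary resource.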
Built-in XML Property Indexing Rule for Atom Namespace
The following is the built-in XML property indexing rule for the Atom namespace. The rule extracts the "src" property of atom:content elements. This may allow clients to query for entry and collection resources that have a known URI as their content.
<?xml version="1.0" encoding="UTF-8"?>
<indexSpecification xmlns="http://example.org/xmlns/openservices/v0.6"
    namespace="http://www.w3.org/2005/Atom">
  <index element="//content">
    <property object="./@src" objectType="uri" />
  </index>
</indexSpecification>
This particular indexing rule is built in. The client MUST NOT update or delete this resource (403 Forbidden).