r9 - 2020-07-14 - 19:48:05 - Main.ndjc

Best Practice: Enable Users to Specify URIs for Custom Attributes and Values

State: Approved

Contact: Nick Crossley

Scope

Most Jazz development tools allow administrative users to define custom properties, link types, enumerated values, and other similar data elements. These user-defined data elements are normally scoped to some context within the tools, e.g. to project areas within Engineering Workflow Manager (EWM). Although this capability is very valuable to users, it leads to complexity when users need to query data that comes from multiple tools or multiple contexts within a tool.

For example, suppose the concept of priority is not built into a tool and that users at a company, Acme, have added it to two contexts, as "Priority" in one and "Urgency" in the other. Furthermore, suppose the allowed values for this property are "High", "Medium", and "Low" in one context, and "P1", "P2", "P3", and "P4" in the other. A query such as "find all the high priority items in all contexts" now becomes problematic, since the query author needs to know what each context means by high priority. Furthermore, when a new context is added, it might use yet another set of terms, and so the query would need to be updated. This problem will become acute for Lifecycle Query Engine (LQE), since it combines data from multiple contexts.

The purpose of this recommendation is to define a best practice for representing these user-defined data elements in RDF so that cross-context queries are easier to author and maintain.

This Best Practice is closely related to Best Practice: Use URIs to Represent Enumerated Values, which discusses how to define enumerated values in RDF vocabularies.

This Best Practice does not address URIs for RDF Data Types - see Use of RDF Data Types.

Recommendation

Generate and publish unique URIs for user-defined data elements

When a user creates a new user-defined data element (link type, property type, or enumerated property value), the tool MUST generate a unique HTTP URI. The generated URI SHOULD be as human-readable as possible. One method for achieving human-readability is to apply a human-friendly name mangling algorithm to the user-defined label for the data element. The tool MUST make the generated URI dereferenceable via HTTP GET as per the W3C Best Practice Recipes for Publishing RDF Vocabularies. The published RDF vocabulary SHOULD include any relevant descriptive information provided by the user as part of the data element definition, e.g. the label and description.
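The name mangling step can be sketched as follows. This is only one possible approach, not the algorithm used by any particular Jazz tool: the function names, the base URI, and the choice of lower-case hyphenated slugs with a numeric suffix for uniqueness are all assumptions made for illustration.

```python
import re
import unicodedata


def mangle_label(label):
    """Turn a user-supplied label into a readable, URI-safe fragment."""
    # Decompose accented characters, then drop anything outside ASCII.
    ascii_label = (
        unicodedata.normalize("NFKD", label)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse each run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", ascii_label).strip("-").lower()
    # Fall back to a fixed word if nothing usable is left (assumption).
    return slug or "term"


def generate_uri(base, label, existing):
    """Return a URI under `base` that is not yet in the set `existing`."""
    slug = mangle_label(label)
    candidate = base + slug
    suffix = 2
    while candidate in existing:
        # Append a counter until the URI is unique within this context.
        candidate = f"{base}{slug}-{suffix}"
        suffix += 1
    existing.add(candidate)
    return candidate
```

With a hypothetical base of `https://server1.example.com/ns/project1#`, the label "Customer Priority" would yield the fragment `customer-priority`, and a second element whose label mangles to the same fragment would receive a `-2` suffix.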

Enable users to optionally specify an external URI for user-defined data elements

During the creation of a user-defined data element as described above, the tool MUST allow the user to optionally specify their own 'external' HTTP URI for that data element. The specified URI MAY be selected from an industry standard such as Dublin Core (DC), Friend of a Friend (FOAF), or Open Services for Lifecycle Collaboration (OSLC), or it MAY be defined by the user's organization. If the URI is defined by the user's organization, then the user's organization SHOULD make it dereferenceable via HTTP GET as per the W3C Best Practice Recipes for Publishing RDF Vocabularies.
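A tool accepting an external URI would probably want to check that the value really is an absolute HTTP URI before storing it. The check below is a minimal sketch using only the Python standard library; the function name is hypothetical, and treating `https` as acceptable alongside `http` is an assumption rather than something this Best Practice states.

```python
from urllib.parse import urlsplit


def is_http_uri(candidate):
    """Return True if `candidate` is an absolute http(s) URI with a host."""
    try:
        parts = urlsplit(candidate)
    except ValueError:
        # urlsplit rejects some malformed authorities, e.g. an unclosed "[".
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

Under these assumptions, `http://acme.example.com/ns/alm#priority` is accepted, while a URN or a value with no scheme is rejected.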

Consider the URI to be used in data instances

The URI used in the RDF representation of all resources that contain a user-defined data element must be considered with care. Using the URI selected by the user, as defined above, is one obvious choice. However, that has implications for indexing and query stability if it is likely, or even possible, that the user may change the URI. Where the user can change the URI, it is better to use the immutable system-generated URI in the representations of data instances, and use owl:sameAs to equate that URI to the one of the user's choice. With this approach, if the user changes the URI they want to use, only the resource containing the owl:sameAs statement needs to be updated. The owl:sameAs statement may be used either by an inferencing engine or explicitly in queries; there is a run-time cost to this approach, but that cost may be deemed acceptable when compared with the cost of rebuilding query indexes.

Queries, report builders, etc., should be designed with both of these approaches in mind.

Note that owl:sameAs is a symmetric statement: it asserts that two URIs are the same as each other. If an inferencing engine is used, this symmetry is handled by that engine. If explicit queries are used, query writers or builders MUST query for the owl:sameAs assertion in either direction - a property path of the form owl:sameAs|^owl:sameAs is a suitable way to express this.
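The effect of the owl:sameAs|^owl:sameAs property path can be illustrated outside SPARQL with a small in-memory sketch. Triples are modelled here as plain Python tuples, and the server URIs are hypothetical placeholders; the point is only that a match must be accepted whichever way round the statement was asserted.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical URIs, for illustration only.
S1_HIGH = "https://server1.example.com/enum/high"
S1_LOW = "https://server1.example.com/enum/low"
S2_P1 = "https://server2.example.com/enum/p1"
P1 = "http://acme.example.com/ns/alm#P1"
P3 = "http://acme.example.com/ns/alm#P3"

triples = [
    (S1_HIGH, OWL_SAME_AS, P1),  # system URI as subject
    (P1, OWL_SAME_AS, S2_P1),    # same kind of link, asserted the other way
    (S1_LOW, OWL_SAME_AS, P3),
]


def same_as_neighbours(triples, uri):
    """URIs linked to `uri` by one owl:sameAs statement in either direction."""
    found = set()
    for s, p, o in triples:
        if p != OWL_SAME_AS:
            continue
        if o == uri:
            found.add(s)  # forward match: ?x owl:sameAs <uri>
        if s == uri:
            found.add(o)  # inverse match: <uri> owl:sameAs ?x
    return found
```

Here `same_as_neighbours(triples, P1)` finds both `S1_HIGH` and `S2_P1`; a lookup that only checked the forward direction would miss `S2_P1`.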

Examples

Example setup - variable URIs for property values

The administrator of project 1 uses Priority=High, Medium, and Low. The administrator of project 2 uses Urgency=P1, P2, P3, and P4. Nonetheless, the administrators want to enable cross-project reporting. The tool MUST therefore allow the administrators to specify the external URIs with company-standard values:

Project	System URI	Label	External URI
1	<server1UriForPriority>	Priority	`http://acme.example.com/ns/alm#priority`
1	<server1UriForHigh>	High	`http://acme.example.com/ns/alm#P1`
1	<server1UriForMedium>	Medium	`http://acme.example.com/ns/alm#P2`
1	<server1UriForLow>	Low	`http://acme.example.com/ns/alm#P3`
2	<server2UriForUrgency>	Urgency	`http://acme.example.com/ns/alm#priority`
2	<server2UriForP1>	P1	`http://acme.example.com/ns/alm#P1`
2	<server2UriForP2>	P2	`http://acme.example.com/ns/alm#P2`
2	<server2UriForP3>	P3	`http://acme.example.com/ns/alm#P3`
2	<server2UriForP4>	P4	`http://acme.example.com/ns/alm#P4`

The tools MUST generate vocabularies where owl:sameAs is used to relate the system and external URIs, as shown in the partial term definitions below:

<server1UriForPriority>
   rdfs:label "Priority" ;
   rdfs:comment "The priority of a task" ;
   owl:sameAs <http://acme.example.com/ns/alm#priority> .

<server1UriForHigh>
   rdfs:label "High" ;
   rdfs:comment "High priority task" ;
   owl:sameAs <http://acme.example.com/ns/alm#P1> .

...

and

<server2UriForUrgency>
   rdfs:label "Urgency" ;
   rdfs:comment "The urgency of a task" ;
   owl:sameAs <http://acme.example.com/ns/alm#priority> .

<server2UriForP1>
   rdfs:label "P1" ;
   rdfs:comment "Do these things first" ;
   owl:sameAs <http://acme.example.com/ns/alm#P1> .
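As a rough illustration of how a tool might emit such term definitions from the rows of the table above, the following sketch builds the Turtle text for one term. The function names and the concrete system URI are hypothetical, and the output assumes that the rdfs: and owl: prefixes are declared elsewhere in the vocabulary document, as in the partial definitions shown here.

```python
def turtle_literal(text):
    """Quote `text` as a Turtle string literal, escaping special characters."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def term_definition(system_uri, label, comment, external_uri=None):
    """Build a partial Turtle definition for one user-defined term."""
    predicates = [
        f"rdfs:label {turtle_literal(label)}",
        f"rdfs:comment {turtle_literal(comment)}",
    ]
    # The owl:sameAs link is only emitted when the user supplied a URI.
    if external_uri is not None:
        predicates.append(f"owl:sameAs <{external_uri}>")
    body = " ;\n".join(f"   {p}" for p in predicates)
    return f"<{system_uri}>\n{body} ."
```

Calling `term_definition("https://server1.example.com/enum/high", "High", "High priority task", "http://acme.example.com/ns/alm#P1")` produces text of the same shape as the definition of the "High" value above.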

Example representations 1 - enumeration value URIs are variable

In this case, the tool developers consider it likely that users will change the external URIs chosen for the priority values, but not the URI for the priority property itself. This case would also apply if the property URI itself was a standard OSLC or jazz.net URI, and just the enumeration values were user-defined.

Artifacts in project 1 would have these (partial) representations:

<cr1> a oslc_cm:ChangeRequest ;
   <http://acme.example.com/ns/alm#priority> <server1UriForHigh> .

<cr2> a oslc_cm:ChangeRequest ;
   <http://acme.example.com/ns/alm#priority> <server1UriForLow> .

Artifacts in project 2 would have these (partial) representations:

<cr8> a oslc_cm:ChangeRequest ;
   <http://acme.example.com/ns/alm#priority> <server2UriForP2> .

<cr9> a oslc_cm:ChangeRequest ;
   <http://acme.example.com/ns/alm#priority> <server2UriForP1> .

Where inferencing is not available, an explicit SPARQL query for all high priority items across both projects is:

   PREFIX owl: <http://www.w3.org/2002/07/owl#>

   SELECT ?item
   WHERE {
      ?item <http://acme.example.com/ns/alm#priority> ?enum .
      ?enum owl:sameAs|^owl:sameAs <http://acme.example.com/ns/alm#P1> .
   }

If inferencing were available, the following query would be sufficient:

   SELECT ?item 
   WHERE {
      ?item <http://acme.example.com/ns/alm#priority> <http://acme.example.com/ns/alm#P1> .
   }
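To make the difference between the two queries concrete, the sketch below imitates, in a very reduced form, what an inferencing engine contributes: it computes the symmetric and transitive closure of owl:sameAs and then matches the simple pattern against any equivalent value. It is not how LQE or any particular engine is implemented; the server URIs are hypothetical, triples are plain tuples, and only the object position of a statement is considered.

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
PRIORITY = "http://acme.example.com/ns/alm#priority"
P1 = "http://acme.example.com/ns/alm#P1"
P2 = "http://acme.example.com/ns/alm#P2"
P3 = "http://acme.example.com/ns/alm#P3"

# Hypothetical system-generated URIs.
S1_HIGH = "https://server1.example.com/enum/high"
S1_LOW = "https://server1.example.com/enum/low"
S2_P1 = "https://server2.example.com/enum/p1"
S2_P2 = "https://server2.example.com/enum/p2"

triples = [
    (S1_HIGH, OWL_SAME_AS, P1), (S1_LOW, OWL_SAME_AS, P3),
    (S2_P1, OWL_SAME_AS, P1), (S2_P2, OWL_SAME_AS, P2),
    ("cr1", PRIORITY, S1_HIGH), ("cr2", PRIORITY, S1_LOW),
    ("cr8", PRIORITY, S2_P2), ("cr9", PRIORITY, S2_P1),
]


def equivalents(triples):
    """Map each URI to all URIs reachable through owl:sameAs links."""
    links = defaultdict(set)
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            links[s].add(o)  # treat the link as symmetric
            links[o].add(s)
    closure = {}
    for start in links:
        seen, stack = {start}, [start]
        while stack:  # depth-first walk gives the transitive closure
            for nxt in links[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        closure[start] = seen
    return closure


def items_with_value(triples, predicate, value):
    """Subjects whose `predicate` value is `value` or any sameAs equivalent."""
    same = equivalents(triples).get(value, {value})
    return sorted(s for s, p, o in triples if p == predicate and o in same)
```

With this sample data, `items_with_value(triples, PRIORITY, P1)` returns `cr1` and `cr9`, the high priority items from both projects, even though neither artifact refers to the external URI directly.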

Example representations 2 - property predicate URI and enumeration value URIs are variable

In this case, the tool developers consider it likely that users will change the external URI for both the priority values and the priority property itself.