r4 - 2015-10-14 - 00:31:35 - Main.ndjc

Best Practice: Use FOAF to Identify People and Agents

State: Approved

Contact: Nick Crossley

Scope

Many RDF resource representations include information (encoded as triples) about people or agents (e.g. build engines). For example, many resources use the Dublin Core property dcterms:creator to identify who created the resource. This recommendation describes a Best Practice for what to use as the object of triples that identify people or agents.

The Friend of a Friend (FOAF) vocabulary defines types and properties for people and agents. FOAF is widely used on the Semantic Web and is recommended by the Open Services for Lifecycle Collaboration (OSLC) Core specification. We therefore recommend its use by our products.

The adoption of this Best Practice will make it easier for users to write queries across products, e.g. to find all resources created by a given person.

Recommendation

Use HTTP URIs

Linked Data design principles state that HTTP URIs should be used to identify things, and that these URIs should be dereferenceable via HTTP GET. People and agents (e.g. build engines) SHOULD therefore be identified with HTTP URIs.

URIs that identify people and agents MUST be handled in accordance with web Best Practices (see Cool URIs for the Semantic Web). People and agents are examples of real-world objects. They are not themselves web documents, aka information resources. When a server receives an HTTP GET request for the URI of a real-world object it SHOULD return the representation (e.g. in HTML or RDF) of an information resource that provides information about that real-world object. This can be achieved through the use of so-called hash-URIs or 303 redirects. The key point here is that a real-world object (e.g. a person or agent) has a different URI than its associated information resource (e.g. a user account). Failure to make this distinction can lead to nonsensical RDF statements such as asserting that a person has been archived or modified.
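As a minimal sketch of the hash-URI approach, the following Python shows how a person's URI and the URI of the document that describes the person differ only by the fragment. The URI https://example.org/users/nally#me is hypothetical and is not taken from any product.

```python
from urllib.parse import urldefrag


def information_resource_uri(thing_uri: str) -> str:
    """Return the URI of the document a client fetches for a hash-URI.

    HTTP clients strip the fragment before sending a GET request, so the
    real-world object (with fragment) and its information resource
    (without fragment) end up with distinct URIs.
    """
    document_uri, _fragment = urldefrag(thing_uri)
    return document_uri


# Hypothetical person URI; example.org is a placeholder host.
person = "https://example.org/users/nally#me"
document = information_resource_uri(person)
print(document)
```

With 303 redirects the person URI carries no fragment; instead, the server answers a GET on the person URI with a 303 See Other response that points to the separate document URI.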

Use External URIs If Available

Our first preference is to identify users with an external URI that is the same across every application and every Jazz (or non-Jazz) installation. For example, a public or corporate social networking application might issue suitable URIs to people. However, our products do not work this way today, so this is a longer-term direction. When no external user URI is available, or when a product cannot use one, the first fallback is a Jazz Team Server (JTS) URI allocated by the local Jazz instance. Any new application that does not currently support URIs for users should aim to adopt external global user URIs, treating the local JTS user URI only as an optional service that allocates a global URI for users who do not already have one.

Use Blank Nodes When No URIs Are Available

In addition to the absence of external user URIs, we have the need to support Jazz deployments that contain more than one JTS, as well as products like DOORS that do not use JTS.

If there is no URI available to a product, then the RDF representation of any information resource (e.g. a work item) that refers to a person or agent (e.g. via dcterms:creator) must use a blank node and include the applicable FOAF type and properties to describe the person or agent. This is a widespread practice in the Linked Data community.
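The blank-node form can be sketched as follows. This is an illustration only: the function name and the work item URI https://example.org/workitems/42 are hypothetical, and the output is Turtle, one of several possible RDF serializations.

```python
def creator_blank_node_turtle(resource_uri: str, name: str, email: str) -> str:
    """Render Turtle for a resource whose creator is described by a blank node."""
    # Escape backslashes and double quotes so the name stays a valid
    # Turtle string literal (other control characters are not handled here).
    safe_name = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n"
        "\n"
        f"<{resource_uri}> dcterms:creator [\n"
        "    a foaf:Person ;\n"
        f'    foaf:name "{safe_name}" ;\n'
        f"    foaf:mbox <mailto:{email}>\n"
        "] .\n"
    )


# Hypothetical work item URI; the name and address are the ones used on this page.
print(creator_blank_node_turtle(
    "https://example.org/workitems/42", "Martin Nally", "nally@us.ibm.com"))
```

The square brackets denote the blank node: the creator has no URI of its own, but its FOAF type and properties are still available to queries.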

Use FOAF

The use of FOAF is not limited to blank nodes. If a product uses a URI instead of a blank node, then the URI must be dereferenceable via HTTP GET and its RDF representation must include the applicable FOAF type and properties (details below). The information resources that describe users must be indexable through a Tracked Resource Set (TRS) so that queries can make use of the user name and potentially other information [how this will work for external URIs is an open issue; a TRS adapter service might be required].

Use `foaf:mbox` as an Identifier

Since, in general, there may be many URIs or no URI for a person, we need a reliable way to identify people in queries. If we always provide FOAF properties, then they can be used to identify people. The natural choice, the person's name (e.g. "Martin Nally"), has the flaw that it may not be unique within an organization. A better alternative is the email address, e.g. "nally@us.ibm.com", since most organizations prohibit the sharing of email addresses. FOAF defines foaf:mbox for this purpose; its value is a mailto: URI. For example, the following SPARQL query:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?x
WHERE {
   ?x foaf:mbox <mailto:nally@us.ibm.com>
}

binds ?x to all resources that identify Martin Nally, whether they are URIs or blank nodes. foaf:Person representations MUST include values for foaf:mbox and foaf:name, and SHOULD provide values for foaf:givenName and foaf:familyName.

In some public contexts, email addresses are considered private information, e.g. to thwart spammers. FOAF provides the property foaf:mbox_sha1sum for this purpose. The object of this property is the SHA1 hash of the person's mailbox URI (the mailto: URI, including the "mailto:" prefix). In our case, hashing provides no extra value: our products require authentication, so agents that mine websites for email addresses on behalf of spammers are not an issue, and after authentication the web UI already exposes email addresses in rich hovers. Consequently, RDF representations must include foaf:mbox and not foaf:mbox_sha1sum.
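For reference only, since this Best Practice does not publish the hashed form, the following sketch shows how a foaf:mbox_sha1sum value would be derived. It assumes the FOAF convention of hashing the full mailbox URI, including the "mailto:" prefix; the function name is hypothetical.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """Return the hex SHA1 digest of the mailbox URI for the given address."""
    # The full mailbox URI is hashed, including the "mailto:" prefix.
    mailbox_uri = f"mailto:{email}"
    return hashlib.sha1(mailbox_uri.encode("utf-8")).hexdigest()


digest = mbox_sha1sum("nally@us.ibm.com")
print(len(digest))  # a SHA1 hex digest is always 40 characters long
```

A consumer that knows an email address can recompute the digest and match on it, but cannot recover the address from a published digest alone.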

Use `foaf:Person` or `foaf:Agent`

If the user is a person, the representation should include a foaf:Person type triple. If the user is not a person (e.g. a build agent), it should include a foaf:Agent type triple instead. Note that foaf:Person is a subclass of foaf:Agent; an agent could also be a group, organization, or software program. Just as for people, the RDF representation of agents MUST include foaf:name and foaf:mbox.
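The type and property requirements for people and agents could be checked along the following lines. This is a sketch, not part of any product: the function name and the dictionary-based input shape are assumptions made for illustration.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
REQUIRED = ("name", "mbox")                           # MUST, for people and agents
RECOMMENDED_FOR_PEOPLE = ("givenName", "familyName")  # SHOULD, for people only


def check_description(rdf_type: str, properties: dict) -> tuple:
    """Return (errors, warnings) for a user description.

    rdf_type is the full FOAF class URI; properties maps full FOAF
    property URIs to their values.
    """
    errors, warnings = [], []
    if rdf_type not in (FOAF + "Person", FOAF + "Agent"):
        errors.append(f"unexpected type: {rdf_type}")
    for local in REQUIRED:
        if not properties.get(FOAF + local):
            errors.append(f"missing foaf:{local}")
    if rdf_type == FOAF + "Person":
        for local in RECOMMENDED_FOR_PEOPLE:
            if not properties.get(FOAF + local):
                warnings.append(f"missing foaf:{local}")
    return errors, warnings


# A hypothetical build engine described without a mailbox.
print(check_description(FOAF + "Agent", {FOAF + "name": "Build Engine 1"}))
```

Missing MUST properties are reported as errors and missing SHOULD properties as warnings, which mirrors the two requirement levels used in the text.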

Summary Guidelines for Products

The Best Practice design is to identify users (or agents) with dereferenceable HTTP URIs. These user resources should have the applicable FOAF type and properties, and should be indexed by the Lifecycle Query Engine (LQE) so they are available for queries.

In the absence of a dereferenceable URI, the representation should use a blank node that contains the applicable FOAF type and properties.

In queries, foaf:mbox SHOULD be used as the preferred way to identify users across products since different products may use different URIs or blank nodes. Use foaf:name for both people and agents. Use foaf:givenName and foaf:familyName for people.
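A cross-product query of this kind, finding all resources created by a given person, might be generated as in the sketch below. The function name and the variable names ?resource and ?creator are illustrative; the pattern matches the creator whether it is a URI or a blank node.

```python
def resources_created_by(email: str) -> str:
    """Build a SPARQL query for resources whose creator has the given mailbox."""
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in email for ch in "<>\"{}|^`\\ \n\t"):
        raise ValueError("email contains characters not allowed in an IRI")
    return (
        "PREFIX dcterms: <http://purl.org/dc/terms/>\n"
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "\n"
        "SELECT ?resource\n"
        "WHERE {\n"
        "   ?resource dcterms:creator ?creator .\n"
        f"   ?creator foaf:mbox <mailto:{email}>\n"
        "}\n"
    )


print(resources_created_by("nally@us.ibm.com"))
```

Because the join goes through foaf:mbox rather than through the creator's URI, the same query works across products that use different URIs or blank nodes for the same person.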