XHTML metainformation profile

This document, http://purl.org/net/ns/metaprof, is a metadata profile for XHTML, as well as an RDDL namespace document for GRDDL transformation. (Note: use this profile only with well-formed XHTML, not HTML4 etc., because a GRDDL agent will try to apply an XSLT transformation to the document.)

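This document, http://purl.org/net/ns/metaprof, is a profile that defines how metadata inside an XHTML document is interpreted. It also uses the GRDDL mechanism to extract metadata (an RDF graph) from an XHTML document with XSLT. By giving this profile URI as the value of the profile attribute of the head element, you can share the meaning of the rel and class attribute values defined below and provide RDF easily (see also the explanation in "名前のウェブとXHTML文書のプロファイル", "The Web of Names and XHTML Document Profiles").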

Profile for XHTML link types

This document is a metadata profile in the sense of the HTML specification, in section 7.4.4.3 Meta data profiles, as refined by XMDP.

If the rel attribute of a hyperlink (an 'a' element or a 'link' element) has one of the values listed below, the target of the link (the resource referenced by the corresponding href attribute) shall have the following semantics:

nofollow
Refers to a document whose contents do not necessarily follow in any way from the topic or themes of the current document. This type of relationship is typically expressed within a document that includes hypertext content whose origin is unknown or untrusted. (Based on a suggestion by Dan Brickley.)
Search engines may use this relation to withhold credit from the link target in their search results, as per 'Preventing comment spam' on the Google blog.
license
Link target represents the type of "license" of the document, in the sense defined in Extending Creative Commons Metadata.
openid.server
Link target is an OpenID Identity Provider that authenticates the document owner's OpenID Identity URL (i.e. the document's URI), as per section 3.1 of OpenID Authentication 1.1. The Identity URL need not be authenticated directly; authentication may instead be delegated to a different URL specified by an openid.delegate link.
openid.delegate
Link target is a delegated Identity URL on behalf of the document owner, which would be authenticated via an OpenID Identity Provider specified by an openid.server link, as per section 3.1.1 of OpenID Authentication 1.1.
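
As an illustration of the two OpenID link types, here is a minimal Python sketch that collects openid.server and openid.delegate targets from a well-formed XHTML document. The function name openid_links and the example.org URLs are hypothetical, and the code is not part of this profile or of any OpenID library.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

def openid_links(xhtml_text):
    """Collect openid.server / openid.delegate targets from link elements."""
    root = ET.fromstring(xhtml_text)
    found = {}
    for link in root.iter(XHTML_NS + "link"):
        # rel holds a space-separated list of link types
        for rel in link.get("rel", "").split():
            if rel in ("openid.server", "openid.delegate"):
                found[rel] = link.get("href")
    return found

# Hypothetical document: the URLs are placeholders, not real services.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
 <head profile="http://purl.org/net/ns/metaprof">
  <title>Home</title>
  <link rel="openid.server" href="https://idp.example.org/auth" />
  <link rel="openid.delegate" href="https://me.example.org/" />
 </head>
 <body><p>Hello</p></body>
</html>"""

links = openid_links(SAMPLE)
print(links)
```

When an openid.delegate link is present, the delegated URL, rather than the document's own URI, is the identity that the provider named by openid.server would authenticate.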

Also, a link element with one of the following rel/rev attribute values specifies this relationship with the target:

meta
with rel="meta", the target is an RDF metadata file for the document.
made
with rev="made", the target is an e-mailbox of the author of the document.

Mapping to popular vocabularies

In this definition, the following namespace prefix declarations are assumed:

  xmlns:foaf="http://xmlns.com/foaf/0.1/"
  xmlns:dcterms="http://purl.org/dc/terms/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:wn="http://xmlns.com/wordnet/1.6/"
  xmlns:cc="http://web.resource.org/cc/"
  xmlns:kw="http://purl.org/net/ns/wordmap#"
  xmlns:sitemap="http://purl.org/net/ns/sitemap#"
topic
Link target is a "foaf:topic" of the document, or other entity described in the document.
source
Link target is a "dc:source" of the document, or other entity described in the document.
rights
Link target is the value of "dc:rights" of the document.
subject
Link target is a "dc:subject" of the document, or other entity described in the document.
creator
Link target is a "dc:creator" of the document, or other entity described in the document.
contributor
Link target is a "dc:contributor" of the document, or other entity described in the document.
publisher
Link target is a "dc:publisher" of the document, or other entity described in the document.
description
Link target is a "dc:description" of the document, or other entity described in the document.
coverage
The document, or other entity described in the document has "dc:coverage" of the link target.
references
The document, or other entity described in the document "dcterms:references" the link target.
section
Link target is a "sitemap:section" (child node of sitemap tree that has more paths) of the document, or other entity described in the document.
work
Link target is a "sitemap:work" (leaf node of sitemap tree) of the document, or other entity described in the document.
hasPart
The document, or other entity described in the document "dcterms:hasPart" the link target.
isPartOf
Link target is a "dcterms:isPartOf" of the document, or other entity described in the document.
alternate
If an hreflang attribute is present, link target is a "dcterms:hasVersion" of the document. Otherwise, if the type attribute contains +xml, link target is an "rdfs:seeAlso" of the document.
tag
If used with an a element, the document, or other entity described in the document, has a kw:keyword whose lexical representation is the content of this element. Also, the link target of this element has some dc:relation with the keyword.
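
The mapping above can be sketched as a lookup table applied to every a and link element. The following Python example is only an illustration under stated assumptions: it covers a subset of the rel values, always uses the document itself as the subject, leaves href values unresolved, and is not the XSLT stylesheet that this profile provides. The names REL_TO_PROPERTY and link_triples and the example.org URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
SITEMAP = "http://purl.org/net/ns/sitemap#"

# Subset of the rel values defined by the profile
REL_TO_PROPERTY = {
    "topic": FOAF + "topic",
    "source": DC + "source",
    "subject": DC + "subject",
    "creator": DC + "creator",
    "references": DCTERMS + "references",
    "hasPart": DCTERMS + "hasPart",
    "isPartOf": DCTERMS + "isPartOf",
    "section": SITEMAP + "section",
    "work": SITEMAP + "work",
}

def link_triples(xhtml_text, doc_uri=""):
    """Return (subject, property, object) triples for known rel values."""
    root = ET.fromstring(xhtml_text)
    triples = []
    for elem in root.iter():
        if elem.tag not in (XHTML_NS + "a", XHTML_NS + "link"):
            continue
        href = elem.get("href")
        if href is None:
            continue
        for rel in elem.get("rel", "").split():
            prop = REL_TO_PROPERTY.get(rel)
            if prop is not None:
                # Assumption: the document is the subject of every statement
                triples.append((doc_uri, prop, href))
    return triples

SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
 <head profile="http://purl.org/net/ns/metaprof">
  <title>Notes</title>
  <link rel="isPartOf" href="http://example.org/notes/" />
 </head>
 <body>
  <p>With <a rel="topic" href="http://www.w3.org/RDF/">RDF</a>, we can describe things.</p>
 </body>
</html>"""

triples = link_triples(SAMPLE, "http://example.org/notes/rdf")
for triple in triples:
    print(triple)
```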

Prefixed Attribute Values

An attribute value that consists of one of the following dot (.) terminated prefixes followed by a name is treated as the term with that name in the corresponding namespace.

foaf.
The name comes from the FOAF vocabulary namespace (http://xmlns.com/foaf/0.1/).
dc.
The name comes from the Dublin Core Metadata Element Set namespace (http://purl.org/dc/elements/1.1/).
dcterms.
The name comes from the Dublin Core Terms namespace (http://purl.org/dc/terms/).

In addition, this profile specifies "schema." prefixed attribute values by referencing section 4 of RFC2731.

schema.pfx
associates an element name prefix (such as pfx.creator) with the reference definition of the element set that it identifies (value of href attribute). This profile extends the scope of this prefix to rel and class attributes in any element, as well as name attributes of meta elements.
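
The prefix rules above can be read as a small name-resolution step. The Python sketch below expands a dot-prefixed value into a full URI, using the three predefined prefixes plus any schema.pfx bindings supplied as (rel, href) pairs. The function name resolve, the case-sensitive match on "schema.", and returning None for unknown prefixes are assumptions made for this illustration.

```python
PREDEFINED = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

def resolve(name, schema_links=()):
    """Expand a 'pfx.local' value to a full URI, or return None.

    schema_links is an iterable of (rel, href) pairs taken from link
    elements, for example ("schema.prism", "http://...").
    """
    prefixes = dict(PREDEFINED)
    for rel, href in schema_links:
        if rel.startswith("schema."):
            # "schema.prism" binds the prefix "prism" to href
            prefixes[rel[len("schema."):]] = href
    prefix, dot, local = name.partition(".")
    if not dot or not local or prefix not in prefixes:
        return None
    return prefixes[prefix] + local

PRISM = ("schema.prism", "http://prismstandard.org/namespace/1.2/basic/")
print(resolve("dc.description"))
print(resolve("prism.publicationDate", [PRISM]))
```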

Names for meta element

If the name attribute of a meta element has one of the values below, the content attribute value will be interpreted as follows.

description
The content value will be dc:description of the document.
keywords
The content value is interpreted as an entity in the http://www.kanzaki.com/ns/keyword/ namespace. That entity will be a kw:keyword of the document. If the content value contains several keywords separated by commas, each keyword yields an independent triple.
author
An entity for the document's author will be generated. The value of content will be foaf:name of the entity.
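
A rough Python sketch of these three meta rules follows. It assumes that a keyword entity URI is formed by appending the percent-encoded keyword to the keyword namespace, and it uses the placeholder labels "document" and "_:author" for the subjects; neither detail is specified by this profile. The property that links the document to the author entity is not stated above, so the sketch emits only the foaf:name statement.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

XHTML_NS = "{http://www.w3.org/1999/xhtml}"
KEYWORD_NS = "http://www.kanzaki.com/ns/keyword/"

def meta_statements(xhtml_text):
    """Return (subject, property, object) tuples for the known meta names."""
    root = ET.fromstring(xhtml_text)
    statements = []
    for meta in root.iter(XHTML_NS + "meta"):
        name = meta.get("name")
        content = meta.get("content", "")
        if name == "description":
            statements.append(("document", "dc:description", content))
        elif name == "keywords":
            # Each comma-separated keyword becomes its own triple
            for word in content.split(","):
                word = word.strip()
                if word:
                    # Assumption: entity URI = namespace + encoded keyword
                    statements.append(("document", "kw:keyword", KEYWORD_NS + quote(word)))
        elif name == "author":
            statements.append(("_:author", "foaf:name", content))
    return statements

# Hypothetical sample document
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
 <head profile="http://purl.org/net/ns/metaprof">
  <title>Notes</title>
  <meta name="description" content="A note on RDF" />
  <meta name="keywords" content="RDF, metadata" />
  <meta name="author" content="Jane Doe" />
 </head>
 <body><p>Text</p></body>
</html>"""

statements = meta_statements(SAMPLE)
for statement in statements:
    print(statement)
```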

Metadata extraction via GRDDL

This profile also defines an XSLT stylesheet that extracts RDF metadata from an XHTML document using GRDDL.

In addition to the above defined rel attribute values, this profile defines the following mapping from elements or class attribute values to RDF vocabularies.

dfn element
class="subject"
If a <dfn> element or an inline element with class="subject" is present, the content of the element is interpreted as an entity with http://www.kanzaki.com/ns/keyword/ namespace. That entity will be kw:keyword of the document, or other entity described in the document.
class="abstract"
If a <p> element with class="abstract" is present, dc:description will be generated.
class="created | modified"
If an inline element with class="created" (class="modified") is present, dcterms:created (dcterms:modified) will be generated.
class="creator | contributor | publisher"
If an inline element with a class value of creator, contributor or publisher is present, the corresponding Dublin Core property will be generated, with an object entity whose rdfs:label is the content of the element.
class="author | coverage | date | description | format | identifier | rights | title"
If an inline element with a class value of author, coverage, date, description, format, identifier, rights or title is present, the corresponding Dublin Core property will be generated, with the content of the element as its literal object (author will become dc:creator).
class="vcard | vevent | hreview"
If a block-level element with a class value of vcard, vevent or hreview is present, the class attributes of the element and its descendants will be treated as the corresponding microformat, and an RDF graph in the FOAF, RDFCalendar or Review vocabulary will be generated.
class with prefix
If an element has a class attribute whose value has the form dc.*, dcterms.* or foaf.* (e.g. class="dc.description"), the content of the element will be the literal value of dc:*, dcterms:* or foaf:* respectively (where * denotes the property local name in each vocabulary).
rel with prefix
If a <link> or an <a> element has a rel attribute of the form dc.*, dcterms.* or foaf.* (e.g. rel="foaf.topic"), it will be mapped to the designated property, and the value of the href attribute will be its object URI.
schema prefixed value
If a <link> element declares a schema with the form defined in RFC2731, the schema can be the prefix of above mentioned class and rel values.
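
The non-prefixed class rules for inline elements can be summarized as two lookups: classes whose content becomes a literal, and classes whose content labels an object entity. The Python sketch below follows the prose of the list above; the names LITERAL_CLASSES, ENTITY_CLASSES and class_statement are hypothetical, the dictionary standing in for the object entity is only a placeholder, and the actual stylesheet may produce different structures.

```python
# Classes whose element content becomes a literal value
LITERAL_CLASSES = {
    "author": "dc:creator",
    "coverage": "dc:coverage",
    "date": "dc:date",
    "description": "dc:description",
    "format": "dc:format",
    "identifier": "dc:identifier",
    "rights": "dc:rights",
    "title": "dc:title",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}

# Classes whose element content labels an object entity
ENTITY_CLASSES = {"creator", "contributor", "publisher"}

def class_statement(class_name, text):
    """Return (property, object) for an inline element, or None."""
    if class_name in LITERAL_CLASSES:
        return (LITERAL_CLASSES[class_name], text)
    if class_name in ENTITY_CLASSES:
        # Placeholder for an entity node carrying an rdfs:label
        return ("dc:" + class_name, {"rdfs:label": text})
    return None

print(class_statement("title", "An Introduction to RDF/OWL"))
print(class_statement("publisher", "Morikita Shuppan Co., Ltd."))
```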

A class or rel attribute within the body element is treated as a property of the document, of the author of the document, or of another typed node (which is a foaf:topic of the document), according to the following rules.

block level elements
If a block-level element has a 'prefixed' class, it will become a typed node and be related to the document resource by foaf:topic. If the class of a block-level element is not prefixed but begins with a capital letter, it will be treated as a WordNet Class (e.g. class="Book" will be of type wn:Book).
inline element
If the parent block-level element has one of the above classes, the description given by an inline element will be a property of the typed node. Otherwise, it will be a property of the document resource.
class="me"
If a block-level element has class="me", descriptions within that element will be properties of the author of the document, rather than of the document itself.
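
These rules amount to choosing a subject for each inline description based on the class of its parent block-level element. A minimal Python sketch follows, assuming that only the immediate parent block is considered and that class="me" takes precedence; the function name subject_for_inline and the returned labels are hypothetical.

```python
def subject_for_inline(parent_block_class):
    """Decide what an inline element's property describes.

    parent_block_class is the class attribute value of the parent
    block-level element, or None if it has no class attribute.
    """
    tokens = (parent_block_class or "").split()
    if "me" in tokens:
        return "author"
    # A prefixed class or a capitalized class makes the block a typed node
    if any("." in token or token[:1].isupper() for token in tokens):
        return "typed node"
    return "document"

print(subject_for_inline("Book"))
print(subject_for_inline("me"))
print(subject_for_inline(None))
```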

Usage

Put this profile URI in the profile attribute of head element, as:

<head profile="http://purl.org/net/ns/metaprof">

predefined namespaces

For example, from the following XHTML with this profile and rel attribute,

Example:

<html xmlns="http://www.w3.org/1999/xhtml">
 <head profile="http://purl.org/net/ns/metaprof">
 ...
 </head>
 <body>
 ...
 <p>With <a rel="topic" href="http://www.w3.org/RDF/">RDF</a>, we can describe...</p>
 ...

GRDDL will generate RDF/XML something like this:

Example:

<rdf:RDF ...>
 <rdf:Description rdf:about="">
  <foaf:topic rdf:resource="http://www.w3.org/RDF/"/>
  ...
 </rdf:Description>
</rdf:RDF>

schema prefix defined with link element

You can use a predefined prefix (foaf., dc., dcterms.) or any schema namespace by declaring a prefix binding with a link element as defined in RFC2731, and then put the names in class attributes as 'dot prefixed' names (this is an extension to RFC2731 by this profile). Also, a class on a block element whose value starts with a capital letter will be considered a WordNet Class name.

Example:

<html xmlns="http://www.w3.org/1999/xhtml">
 <head profile="http://purl.org/net/ns/metaprof">
  <link rel="schema.prism" href="http://prismstandard.org/namespace/1.2/basic/" />
 ...
 </head>
 <body>
 ...
 <p class="Book">
 I wrote <cite class="title">An Introduction to RDF/OWL</cite>
 in <span class="prism.publicationDate">2005-01</span>
 (<span class="publisher">Morikita Shuppan Co., Ltd.</span>)
 which explains ....
 </p>
 ...

GRDDL extracts the following RDF/XML fragment from the above XHTML. Note that the classes of block-level elements (such as p, div, ul, dl) become the types of nodes rather than properties, and that foaf:topic automatically relates these typed nodes to the document (in Japanese with some English summaries). Note also that you do not need to declare namespace URIs for foaf., dc., wn.

Example:

<rdf:RDF ...>
 <rdf:Description rdf:about="">
  ...
  <foaf:topic>
   <wn:Book>
    <prism:publicationDate>2005-01</prism:publicationDate>
    <dc:title>An Introduction to RDF/OWL</dc:title>
    <dc:publisher>
     <foaf:Agent>
      <foaf:name>Morikita Shuppan Co., Ltd.</foaf:name>
     </foaf:Agent>
    </dc:publisher>
   </wn:Book>
  </foaf:topic>
  ...
 </rdf:Description>
</rdf:RDF>

If a block element has two schema-prefixed class names, the first becomes a property that relates the document to a typed node, and the second becomes the Class of that node (the name in the first token should begin with a lowercase letter, and the name in the second with an uppercase letter). So,

Example:

<p class="prism.isTranslationOf wn.Book">
I wrote <cite class="title">An Introduction to RDF/OWL</cite>
in <span class="prism.publicationDate">2005-01</span>
(<span class="publisher">Morikita Shuppan Co., Ltd.</span>)
which explains ....
</p>

will become

Example:

...
<prism:isTranslationOf>
 <wn:Book>
  <prism:publicationDate>2005-01</prism:publicationDate>
  <dc:title>An Introduction to RDF/OWL</dc:title>
  <dc:publisher>
   <foaf:Agent>
    <foaf:name>Morikita Shuppan Co., Ltd.</foaf:name>
   </foaf:Agent>
  </dc:publisher>
 </wn:Book>
</prism:isTranslationOf>
...

rather than being related by foaf:topic. Note that the second token must be dot prefixed (there is no automatic transformation to a wn: class).
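
The one-token and two-token cases for block-level classes can be sketched as a single function that returns the relating property and the node type. This Python example is an assumption-laden illustration: the name relation_and_type is hypothetical, the lowercase/uppercase convention is not checked, and combinations that the text does not describe (such as a prefixed token followed by a non-prefixed one) simply return None.

```python
def relation_and_type(class_value):
    """Return (property, node_type) for a block-level class value, or None."""
    tokens = class_value.split()
    if len(tokens) == 2 and all("." in token for token in tokens):
        # Two prefixed tokens: the first is the property, the second the Class
        prop, node_type = (token.replace(".", ":", 1) for token in tokens)
        return prop, node_type
    if len(tokens) == 1:
        token = tokens[0]
        if "." in token:
            # A single prefixed class is related by foaf:topic
            return "foaf:topic", token.replace(".", ":", 1)
        if token[:1].isupper():
            # A capitalized, non-prefixed class is read as a WordNet Class
            return "foaf:topic", "wn:" + token
    return None

print(relation_and_type("prism.isTranslationOf wn.Book"))
print(relation_and_type("Book"))
```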

Consult Embedding metadata in XHTML and extracting them as RDF for detailed documentation.