Registry Interchange Format - Collections and Services

Schema Guidelines

Last Updated: 10 July 2009

Table of Contents

  1. Purpose
  2. Overview
  3. Example
  4. Schema
  5. Schema Elements and Usage
  6. Frequently Asked Questions
  7. Guidelines Revision History

Purpose

This document describes the use of the Registry Interchange Format - Collections and Services (RIF-CS) Schema for the purposes of exposing collections metadata via an OAI-PMH Data Provider to a collections registry. It is intended for ANDS Collections Registry data providers and potential community users in Australia, and for use within a global registries context.

This document is aimed primarily at a technical audience, namely those developing OAI-PMH Data Provider code to generate material in RIF-CS format or those creating mappings from a native format to RIF-CS XML.

This document provides general direction for RIF-CS schema use but also provides specific guidance to particular RIF-CS user communities. Specific guidance is clearly marked as such.

Overview

The RIF-CS Schema was developed as a data interchange format for supporting the submission of collections metadata to the ANDS Collections Registry. It is based on ISO 2146 but only includes the elements needed for a collection service registry, and so is not a full binding to the standard.

The RIF-CS user community is encouraged to provide feedback so that the schema can evolve to meet a wider range of needs. The schema also has an accompanying set of vocabularies.

Currently the primary registry object type is collections. A collection in the RIF-CS Schema context could be a repository, a registry, a collective work or an index/database. There are no hard and fast rules about what constitutes a collection; it is up to data providers to decide what their collections are and what metadata to provide. RIF-CS also supports other registry object types, namely services, activities and parties. Any or all of these, along with their relations to each other, can be expressed in RIF-CS format. The relations currently supported by the format are illustrated in Figure 1. Adopters of the RIF-CS format are encouraged to identify new relations that need to be supported.

Example

This example of a repository describing its holdings in RIF-CS will not be discussed but may be used for reference when reviewing the next section.

Schema

This section describes in detail the use of each schema element within the registry interchange/repository harvest context. For content model and restrictions (optional, mandatory, recurrence) refer to the schema documentation and controlled vocabularies.


Schema Elements and Usage

accessPolicy | activity | address | addressPart | arg | collection | description | description (relation) | electronic | identifier | key | location | name | namePart | originatingSource | party | physical | registryObject | registryObjects | relatedInfo | relatedObject | relation | service | spatial | subject | url | value


Element <registryObjects>

This element is the root element for any RIF-CS compliant document.

May contain: registryObject


Element <registryObject>

This element is a wrapper element containing descriptive and administrative metadata for a single registry object.

May contain: activity | collection | party | service

Contained within: registryObjects

Attributes:

group: required

This must contain a value that uniquely identifies the organisation that is contributing this object's metadata. It should be a plain text string set to the name of the contributing organisation, e.g. The Australian National University.


Element <originatingSource>

An element holding a string or URI identifying the entity holding the managed version of the registry object metadata. For example, in a federated aggregation context this must identify the original repository or owning institution from which the metadata was harvested, *not* the aggregator from which it was harvested.

Attributes:

type: optional

A value taken from the Originating Source Type vocabulary.


Element <key>

This element holds a value that uniquely identifies the registry object.

ANDS Collections Registry providers: In the Collections Registry context, the key must be a globally unique identifier.

Contained within: registryObject | relatedObject
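
As a rough sketch of how the wrapper elements described so far fit together, the fragment below shows a root element holding a single registry object. The key, originating source and collection name are made up for illustration, the element order is an assumption, and namespace declarations are omitted; refer to the schema documentation for the exact content model.

```xml
<registryObjects>
  <!-- group: plain text name of the contributing organisation -->
  <registryObject group="The Australian National University">
    <!-- hypothetical key; must be globally unique in the Collections Registry context -->
    <key>au.edu.myuni.myrepo.collection1</key>
    <!-- the repository managing the metadata, not an aggregator (hypothetical URI) -->
    <originatingSource>http://myrepo.myuni.edu.au</originatingSource>
    <collection type="collection">
      <name type="primary">
        <namePart>Example Collection</namePart>
      </name>
    </collection>
  </registryObject>
</registryObjects>
```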


Element <activity>

Wrapper element for descriptive and administrative metadata for an activity registry object.

May contain: description | identifier | location | name | relatedInfo | relatedObject | subject

Contained within: registryObject

Attributes:

type: required

A value taken from the Activity Type vocabulary.

In the example, Australian Partnership for Sustainable Repositories is a program and Online Research Collections Australia (ORCA) is a project that it funds.

dateModified: optional

The date this object's metadata was last changed. This only refers to the metadata of the registry object itself. For example if a collection has a new item added to it this does not constitute a modification to the object.


Element <collection>

Wrapper element for descriptive and administrative metadata for a collection registry object.

May contain: description | identifier | location | name | relatedInfo | relatedObject | subject

Contained within: registryObject

Attributes:

type: required

A value taken from the Collection Type vocabulary.

In the examples, Aboriginal Population Profiles for Development Planning in the North East Kimberley is a collective work, DSpace at The Australian National University is a repository, the ARROW Discovery Service is a catalogue or index and the ORCA Collection Service Registry is a registry.

dateAccessioned: optional

The date this object was registered in a managed environment. Must be UTC and of one of the forms described in section 3.2.7 of the W3C's Schema Data Types document (http://www.w3.org/TR/xmlschema-2/).

dateModified: optional

The date this object's metadata was last changed. This only refers to the metadata of the registry object itself. For example if a collection has a new item added to it this does not constitute a modification to the object. Where an object's metadata has not changed, this attribute should be set to the object's creation date.
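
The date attributes above might be supplied as in the following sketch. The dates are invented, and the type value "repository" assumes that the Collection Type vocabulary spells the term this way; check the vocabulary before use. Both dates are written as UTC xsd:dateTime values, one of the forms covered by section 3.2.7 of the W3C document.

```xml
<!-- dateAccessioned: when the object was registered in a managed environment -->
<!-- dateModified: when the object's own metadata last changed -->
<collection type="repository"
            dateAccessioned="2009-03-02T04:15:00Z"
            dateModified="2009-07-10T01:30:00Z">
  <name type="primary">
    <namePart>DSpace at The Australian National University</namePart>
  </name>
</collection>
```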


Element <party>

Wrapper element for descriptive and administrative metadata for a party registry object.

May contain: description | identifier | location | name | relatedInfo | relatedObject | subject

Contained within: registryObject

Attributes:

type: required

A value taken from the Party Type vocabulary.

In the examples, Scott Yeadon is a person and The Australian National University is a group.

dateModified: optional

The date this object's metadata was last changed. This only refers to the metadata of the registry object itself. For example if a collection has a new item added to it this does not constitute a modification to the object.


Element <service>

Wrapper element for descriptive and administrative metadata for a service registry object.

May contain: accessPolicy | description | identifier | location | name | relatedInfo | relatedObject | subject

Contained within: registryObject

Attributes:

type: required

A value taken from the Service Type vocabulary.

Construct service type as a two-part string, with the first part specifying the service genre and the second part specifying the protocol (e.g., syndicate-rss, harvest-oaipmh, search-sru).

Note: the value for the service genre is taken from the set of service genres registered with the e-Framework. The protocol is taken from known services identified by initial Collections Registry data providers. New genre-protocol combinations may be added on application to the RIF-CS schema manager.

dateModified: optional

The date this object's metadata was last changed. This only refers to the metadata of the registry object itself. For example if a collection has a new item added to it this does not constitute a modification to the object.
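
To illustrate the two-part service type, the sketch below describes an OAI-PMH harvest service using the genre-protocol value harvest-oaipmh mentioned above. The service name, URL and date are hypothetical.

```xml
<!-- type = service genre ("harvest") + protocol ("oaipmh") -->
<service type="harvest-oaipmh" dateModified="2009-07-10T01:30:00Z">
  <name type="primary">
    <namePart>OAI-PMH Data Provider</namePart>
  </name>
  <location>
    <address>
      <electronic type="url">
        <!-- hypothetical base URL of the service -->
        <value>http://myrepo.myuni.edu.au/oai/request</value>
      </electronic>
    </address>
  </location>
</service>
```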


Element <identifier>

Primary and alternative identifiers for this object. The value of the registryObject's <key> element may be repeated here if desired; otherwise, provide any other identifiers used to reference this object.

Contained within: activity | collection | party | service

Attributes:

type: required

A value taken from the Identifier Type vocabulary.


Element <name>

A wrapper element for name metadata allowing the segmentation of name components by use of the namePart element.

ANDS Collections Registry providers: Although the name element is optional it should be treated as mandatory when providing metadata to the Collections Registry.

May contain: namePart

Contained within: activity | collection | party | service

Attributes:

type: optional

A value taken from the Name Type vocabulary.

Use 'type' when more than one name is supplied and there is a need to distinguish between the primary name and alternative versions.

There should only be one primary name, but it can be encoded in multiple language statements. To record versions of an official name in multiple languages, use multiple instances of name with type set to 'primary' and, in each statement, associate the value string with the appropriate language using the xml:lang attribute.

dateFrom: optional

The date from which the name was current. This is only applicable where the name has changed over time and older versions of the name have been recorded in the metadata. Should be UTC and of one of the forms described in section 3.2.7 of the W3C's Schema Data Types document (http://www.w3.org/TR/xmlschema-2/).

dateTo: optional

The date the name became no longer current. This is only applicable where the name has been changed over time and older versions of the name have been recorded in the metadata. Should be UTC and of one of the forms described in section 3.2.7 of the W3C's Schema Data Types document (http://www.w3.org/TR/xmlschema-2/).
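
A primary name recorded in two languages might look like the sketch below. The organisation is fictitious, and placing xml:lang on the name element (rather than on namePart) is an assumption; confirm the attribute's position against the schema documentation.

```xml
<!-- one primary name, stated once per language -->
<name type="primary" xml:lang="en">
  <namePart>National Library of Example</namePart>
</name>
<name type="primary" xml:lang="fr">
  <namePart>Bibliothèque nationale d'Exemple</namePart>
</name>
```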


Element <namePart>

A name can be represented by either a single namePart (as in the case of an organization or group) or may be split into a more granular structure (as in the case of a person) through use of multiple namePart elements (e.g. for title, first name, surname, etc).

ANDS Collections Registry providers: Splitting a name into its constituents is not mandatory, however if split, each namePart of type family, given and initial must only contain a single value and the individual namePart elements must be ordered as they would appear in written form. For a single name constituent provide a type initial or type given within a single name element but not both.

Contained within: name

Attributes:

type: optional

A value taken from the Name Part Type vocabulary.

Use 'type' when there is more than one namePart and there is a need to distinguish between the multiple namePart instances that together make up a name.
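
The two approaches can be sketched as follows, using names that appear in this document's examples. The type values given and family are those named in the guidance above; the parts of the personal name are ordered as they would be written.

```xml
<!-- a person: name split into typed parts, one value per part -->
<name type="primary">
  <namePart type="given">Scott</namePart>
  <namePart type="family">Yeadon</namePart>
</name>

<!-- an organisation or group: a single untyped namePart -->
<name type="primary">
  <namePart>The Australian National University</namePart>
</name>
```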


Element <location>

Wrapper element containing metadata describing location(s) relevant to the registry object. A location element should contain metadata about a single location (e.g. home location, work location or term location for a person; business location, invoicing location and delivery locations for an organisation; service location for a service; physical location for a repository).

NOTE: Do not use location to record the repository in which a collective work is held. Instead use an instance of relatedObject with relation type set to "isLocatedAt".

May contain: address | spatial

Contained within: activity | collection | party | service

Attributes:

type: optional

The type of location being described. A value taken from the Location Type vocabulary.

ANDS Collections Registry providers: If the type is omitted the location of the object being described is assumed. If a type of coverage is used the dateFrom and dateTo attributes may be used to indicate temporal coverage. For values which do not meet the requirements of the date range attributes, use a description element with a type of temporal.

dateFrom: optional

The date from which the location information was current. This is only applicable where the address has changed and older addresses have been recorded in the metadata being provided. Should be UTC and of one of the forms described in section 3.2.7 of the W3C's Schema Data Types document (http://www.w3.org/TR/xmlschema-2/).

dateTo: optional

The date from which the location information was no longer current. This is only applicable where the address has changed and older addresses have been recorded in the metadata being provided. Should be UTC and of one of the forms described in section 3.2.7 of the W3C's Schema Data Types document (http://www.w3.org/TR/xmlschema-2/).


Element <address>

Wrapper element for physical and electronic address metadata.

May contain: physical | electronic

Contained within: location


Element <physical>

Describes the physical address of the object. This element acts as a wrapper for one or more addressPart elements.

May contain: addressPart

Contained within: address

Attributes:

type: optional

A value taken from the Physical Address Type vocabulary.


Element <addressPart>

This element holds either a full or partial address. Multiple addressPart elements may be used to divide the full address into meaningful fragments (e.g. street address, postcode, country).

Contained within: physical

Attributes:

type: required

A value taken from the Address Part Type vocabulary or:

Australian providers: AS 4590 - Interchange of Client Information.

Use addressLine for legacy or non-specific address information or map addressPart type for each element of structured address information to a more specific value from AS 4590 - Interchange of Client Information, using the following types:

When using AS 4590 values for addressPart type, include flatOrUnitNumber and floorOrLevelNumber in buildingOrPropertyName if the data cannot easily be parsed. Similarly, include houseNumber or lotNumber in streetName and postalDeliveryNumberPrefix and postalDeliveryNumberSuffix in postalDeliveryNumberValue if the data cannot easily be parsed.

Encode Australian stateOrTerritory in addressPart as AS 4590 compliant values, i.e. ACT | NSW | NT | QLD | SA | TAS | VIC | WA.
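
As an illustration only, a physical address using the non-specific addressLine type together with an AS 4590 state value might be written as below. The street address is invented, and the use of stateOrTerritory as the type value is inferred from the wording above rather than from the vocabulary itself.

```xml
<location>
  <address>
    <physical>
      <!-- addressLine: legacy or non-specific address information -->
      <addressPart type="addressLine">Example Building, 1 Example Road, Canberra</addressPart>
      <!-- AS 4590 compliant state or territory code -->
      <addressPart type="stateOrTerritory">ACT</addressPart>
    </physical>
  </address>
</location>
```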


Element <electronic>

Wrapper element holding metadata describing the electronic address of the object. An electronic address will generally hold a URI pointing to the object being described. However in the case of a service it is possible to describe the service in terms of its base URL using the <value> element and using the <arg> element to describe the service arguments. A separate collection object which supports the service would then provide a URL to its implementation of the service in its <url> element when describing the <relatedObject>.

May contain: arg | value

Contained within: address

Attributes:

type: optional

A value taken from the Electronic Address Type vocabulary.

Where 'wsdl' is specified, the content of the value element must be a URL to the WSDL file.
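
For instance, a service described by a WSDL file could carry an electronic address like the following sketch, in which the URL is hypothetical:

```xml
<electronic type="wsdl">
  <!-- must be a URL to the WSDL file itself -->
  <value>http://myrepo.myuni.edu.au/services/search?wsdl</value>
</electronic>
```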


Element <value>

Element holding a URI representing the electronic address of the object. For collection, activity and party objects this will typically contain a URI. For services it will likely contain the base URL of the service point for HTTP services with the <arg> element(s) describing the service argument(s). Alternatively a service object could use this element to provide a URL to a WSDL file.

Contained within: electronic


Element <arg>

The arg element is used to describe the arguments for an electronic service. In a Collections Registry context this element must not be used when describing activity, collection or party objects.

Contained within: electronic

Attributes:

required: required

Indicates whether the argument is required (true) or optional (false).

type: required

A value taken from the Arg Type vocabulary.

use: required

A value taken from the Arg Use vocabulary.


Element <spatial>

Holds geographical address information such as co-ordinates or region information.

Contained within: location

Attributes:

type: required

A value taken from the Spatial Type vocabulary.

If this information is encoded in a markup language (i.e. gml, gpx and/or kml) a URL pointing to this information must be provided.

Australian providers: For all registry objects located in Australia, where an address is not available include an ISO 3166-2 occurrence of the <spatial> element with the value 'AU' and, when the state is known, an ISO 3166-2 occurrence of the <spatial> element with the appropriate state code:

AU-ACT | AU-NSW | AU-NT | AU-QLD | AU-SA | AU-TAS | AU-VIC | AU-WA
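
The guidance for Australian providers can be sketched as below for an object known to be in the ACT but with no address available. The type value iso31662 is an assumption about how the Spatial Type vocabulary names ISO 3166-2; check the vocabulary for the exact term.

```xml
<location>
  <!-- country, always supplied when no address is available -->
  <spatial type="iso31662">AU</spatial>
  <!-- state code, supplied when the state is known -->
  <spatial type="iso31662">AU-ACT</spatial>
</location>
```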


Element <relatedObject>

Wrapper element containing metadata describing the relationship of a registry object related to the object currently being described.

May contain: key | relation

Contained within: activity | collection | party | service


Element <relation>

A wrapper element describing the current registry object's relationship to another registry object.

May contain: description | url

Contained within: relatedObject

Attributes:

type: required

A value taken from one of the Activity/Collection/Party/Service Relation Type vocabularies.

Where a hasAssociationWith relation is specified, an accompanying description should outline the details of the association.

In an Activity context hasPart and isPartOf must only be used with activities of the same type. For example, APSR is a program which runs other programs with which it will have hasPart/isPartOf relations. These programs have a set of projects they fund which are explicit funding relations. For example the Development Program's relationship with the ORCA Project would be expressed as a Funds/isFundedBy relation.
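
A minimal sketch of these relation rules follows, describing a program activity. Both keys and the description text are hypothetical, and the type value "program" assumes that spelling in the Activity Type vocabulary.

```xml
<activity type="program">
  <!-- hasPart: the related activity is also a program -->
  <relatedObject>
    <key>au.edu.example.activity.development-program</key>
    <relation type="hasPart"/>
  </relatedObject>
  <!-- hasAssociationWith: the description outlines the association -->
  <relatedObject>
    <key>au.edu.example.party.partner-university</key>
    <relation type="hasAssociationWith">
      <description>Partner institution contributing staff to the program</description>
    </relation>
  </relatedObject>
</activity>
```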


Element <url>

A URI expressing or implementing the relationship between registry objects. For example if describing a collection's relation to a service, the URL which implements the related service in the collection's context can be represented in this element.

Contained within: relation


Element <description>

A plain text description of a registry object.

Contained within: activity | collection | party | service

Attributes:

type: required

A value taken from the Description Type vocabulary.


Element <description> (relation)

A plain text description further refining or describing a relation.

Contained within: relation


Element <relatedInfo>

Any URIs pointing to information related to the collection, party, activity or service. For example a web site providing contextual information about a collection.

Contained within: activity | collection | party | service


Element <subject>

A term or phrase representing the primary topic(s) on which a registry object is focused.

Use subject to associate activities with the field of activity, collection with the subject matter of items in the collection and party with field of activity or occupation. Services can be assigned subjects but may also be associated with a topic through the collections which support them.

Contained within: activity | collection | party | service

Attributes:

type: required

The name of an authoritative list from which the subject term or phrase is taken or that governs its form, taken from the Library of Congress Source Codes for Subjects.

Use 'local' for other controlled lists not yet registered with LC, but consider registering any lists likely to be used widely within a given community or domain.

Use 'local' for uncontrolled terms.

Australian providers: Use 'anzsrc-toa', 'anzsrc-for' and/or 'anzsrc-seo' for the Australian and New Zealand Standard Research Classification pending its registration with LC.
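
For example, a collection might carry one classified subject and one uncontrolled term, as in the sketch below. The values are illustrative only: 0807 is intended as an ANZSRC Field of Research code, and the local term is made up.

```xml
<!-- term taken from the ANZSRC Fields of Research classification -->
<subject type="anzsrc-for">0807</subject>
<!-- uncontrolled term, or one from a list not registered with LC -->
<subject type="local">institutional repositories</subject>
```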


Element <accessPolicy>

A URL describing service access policies. This could be a web site or XACML resource for example.

Contained within: service


Frequently Asked Questions

Why RIF-CS?


Why is there duplication of content models in each of the activity, collection, party and service schema fragments?

It was envisaged that the individual content models for these common structures would evolve over time. They are therefore currently duplicated across the registry object types, which should ease maintenance of the schema and reduce the need for major structural changes to it.


How do I mark up the relationship between a collection and its related service?

For argument's sake, assume a repository offers an RSS feed as a service. Assume also the RSS service can be offered for a single collection by providing the collection identifier as an argument to that service. The service object metadata would contain the description of how to use this service as the following fragment shows:

<registryObject group="my-group">
  <key>au.edu.myuni.myrepo.RSS2.0</key>
  .....
  <service type="syndicate-rss">
    .....
    <name type="primary">
      <namePart>RSS 2.0 Feed</namePart>
    </name>
    <location>
      <address>
        <electronic type="url">
          <value>http://myrepo.myuni.edu.au/feed/rss_2.0</value>
          <arg required="true" type="string" use="keyValue">identifier</arg>
        </electronic>
      </address>
    </location>
    .....
  </service>
</registryObject>

The collection object could then describe its relationship to this service through its relatedObject element as follows:

<collection type="collection">
  .....
   <relatedObject>
      <key>au.edu.myuni.myrepo.RSS2.0</key>
      <relation type="supports">
         <description>Notification of latest 20 items added to this collection</description>
         <url>http://myrepo.myuni.edu.au/feed/rss_2.0?identifier=1030.58/19896</url>
      </relation>
   </relatedObject>
  .....
</collection>


Guidelines Revision History

10 July 2009