Data Source Account Settings

The Data Source Account Settings page in the RDA Registry allows you to view and configure the options in your data source account.

Account Administration Information

These settings contain basic information about your account. It is important to keep this information up to date so that ANDS can easily get in contact with you in the event of system changes/planned outages.

Key

A unique ID identifying your Data Source within the ANDS systems. This key is allocated by ANDS and cannot be changed once the account has been created. The key is for your Data Source, and is not the same as the key each individual metadata record will have.

Example: aims.gov.au

Title

The name of your Data Source. This may be an organisation name or a repository name. The title will be used to identify your data source account throughout the registry pages.

Example: Australian Institute of Marine Science Metadata Catalogue.

Record Owner

The name of the owner of this Data Source administrative record, usually an organisation name that is not ambiguous, i.e. the full organisation name. Acronyms, contractions or nicknames are not permitted.

Example: Australian Institute of Marine Science

Contact Name

The personal name of the Data Source Administrator, given in a form that can be disambiguated.

Example: John T. Smith

Contact Email

Organisational email address of the Data Source Administrator.

Example: name@uni.edu.au

Notes

Any information useful to record about the Data Source or its activities.

Records Management Settings

The Records Management Settings allow you to configure how your records are processed and managed by the registry.

The Reverse Links functionality has been incorporated in the RDA Registry and Research Data Australia to assist you in creating and managing relationships to and from your records. The two Reverse Links options are explained below. More information about related objects and reverse links

Note: The inferred reverse links are created only for display and validation purposes and cannot be edited. You can see them in Research Data Australia, and in the Registry view from Research Data Australia, but not in XML data or manual data entry screens.

Internal Links

The 'Allow reverse internal links' option offers you a way of achieving reciprocal two-way links between the records in your data source when the relationships are defined in one direction only.

Example: Relationships are defined from your collection records to your party records, but the relationships are not defined from your party records back to your collection records.




By enabling the 'Allow reverse internal links' option, Research Data Australia and the RDA Registry will automatically infer and display the reciprocal relationship links when your records are viewed.



External Links

The 'Allow reverse external links' option lets you manage how links to your records from records in other data sources are handled.

Example: A relationship is defined from another organisation's collection record to one of your collection records, but the relationship is not defined from your collection record back to the other organisation's collection record.




By enabling the 'Allow reverse external links' option, Research Data Australia and the RDA Registry will automatically infer and display the reciprocal relationship link to the external records when your records are viewed. The option is disabled in your account by default.
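
As a rough illustration of how the two Reverse Links options behave, the sketch below infers reciprocal links from relationships that are defined in one direction only. It is not the registry's implementation: the Link tuple, the infer_reverse_links function, the data source names and the small table of inverse relation types are all assumptions made for this example.

```python
from typing import NamedTuple

class Link(NamedTuple):
    source_key: str   # record the relationship is defined on
    source_ds: str    # data source that owns that record
    relation: str     # relation type, e.g. "isManagedBy"
    target_key: str   # record the relationship points to
    target_ds: str    # data source that owns the target record

# Illustrative inverse pairs only; not the full relation vocabulary.
INVERSE = {
    "isManagedBy": "isManagerOf",
    "isOwnedBy": "isOwnerOf",
    "isPartOf": "hasPart",
}

def infer_reverse_links(links, my_ds, allow_internal, allow_external):
    """Return the reverse links to display on records owned by my_ds."""
    explicit = {(link.source_key, link.target_key) for link in links}
    inferred = []
    for link in links:
        # Only links that point at one of our records can be reversed onto it.
        if link.target_ds != my_ds or link.relation not in INVERSE:
            continue
        if (link.target_key, link.source_key) in explicit:
            continue  # the reciprocal link is already defined explicitly
        internal = link.source_ds == my_ds
        if (internal and allow_internal) or (not internal and allow_external):
            inferred.append(Link(link.target_key, my_ds,
                                 INVERSE[link.relation],
                                 link.source_key, link.source_ds))
    return inferred
```

With both options switched off the function returns an empty list, which mirrors the behaviour described above: reverse links appear only when the matching option is enabled.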





Create Primary Relationships

These settings allow you to enter up to two keys of published records from within your own data source that will be automatically related to all records in your data source. This simplifies the process of, for example, linking all collections to your institution's party record.

To create relationships to a primary record:

  1. Enable the 'Create Primary Relationships' option. Two sets of fields will be displayed.
  2. Enter the key of a published record from within your data source that you would like as a primary record, in the first 'Primary Record Key' field.
  3. If you don't know the key, you can open a search widget by clicking the magnifying glass button. When you select a record in the search widget, its key is automatically entered into the 'Primary Record Key' field.
  4. After entering a primary record key, you now need to define the relationships the records in your data source will have to your primary record.
  5. Using the drop down fields shown beside each class type, select the relationship that class type will have to your primary record.
  6. Repeat steps 2-5 using the second set of fields to configure a second primary record, or click the 'Save' button to save your settings and generate the relationships.

Note: The primary relationships generated by the system are only for display and validation purposes and cannot be edited. You can see them in Research Data Australia, and in the Registry view from Research Data Australia, but not in XML data or manual data entry screens.
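
The effect of a primary record can be pictured with the short sketch below. It is only an illustration of the idea, not registry code: the function name, the record keys and the relation chosen per class are hypothetical.

```python
def primary_relationships(records, primary_key, relation_by_class):
    """Yield (record_key, relation, primary_key) for each related record.

    records: iterable of (key, class_name) pairs, where class_name is
        "collection", "party", "activity" or "service".
    relation_by_class: the relation selected in the drop down for each class.
    """
    for key, class_name in records:
        if key == primary_key:
            continue  # the primary record is not related to itself
        relation = relation_by_class.get(class_name)
        if relation:  # classes with no relation selected are skipped
            yield (key, relation, primary_key)
```

For example, selecting 'isManagedBy' for the collection class relates every collection in the data source to the primary party record, without editing any of the collection records.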


Manually Publish Records

The 'Manually Publish Records' option offers you a way of preventing your records from being immediately published to Research Data Australia. Enabling the option will display the 'Approved' status table on the Manage Records page and you can then manually publish your approved records at a suitable time. The 'Manually Publish Records' option is disabled in your data source account by default.

The 'Manually Publish Records' flag is also useful for providers wishing to test their harvest and view the records in the RDA Registry. By enabling the option, your harvested records will be held at a status of 'Approved' and will not be visible in Research Data Australia. You can then either delete the records, disable the option and re-harvest, or publish the records manually via the Manage Records page.
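
A minimal sketch of the behaviour described above, ignoring the Quality Assessment workflow. The function names and the upper-case status strings are assumptions for illustration, not registry identifiers.

```python
def status_after_harvest(manually_publish: bool) -> str:
    """Status a harvested record is left in once the harvest completes."""
    # With the option enabled, records wait at 'Approved' until published by hand.
    return "APPROVED" if manually_publish else "PUBLISHED"

def visible_in_rda(status: str) -> bool:
    """Only published records are shown in Research Data Australia."""
    return status == "PUBLISHED"
```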


Quality Assessment Required

This option is set by ARDC registry staff and cannot be edited by Data Source Administrators. Where the option is enabled, all manually created, edited and harvested records will need to be published through the Quality Assessment workflow. Once ARDC believes that you have a good understanding of RIF-CS and best practices, the option will be disabled allowing any future records to be automatically approved.

Assessment Notification Email

The Assessment Notification Email field is used to enter the email addresses of ARDC Quality Assessors for a data source. The field is only available when the 'Quality Assessment Required' flag has been checked (On) and can only be managed by ARDC staff. Multiple email addresses can be entered by separating them with a comma. The addresses are used to notify the assessor(s) when a data source's records are submitted for assessment.
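
The comma-separated format can be handled as in the sketch below. The function name is hypothetical, and the sketch assumes that surrounding spaces and empty entries are ignored; the registry's own parsing may differ.

```python
def parse_notification_emails(field_value: str) -> list[str]:
    """Split a comma-separated field into trimmed, non-empty addresses."""
    return [part.strip() for part in field_value.split(",") if part.strip()]
```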


Harvester Settings

These settings control all aspects of harvesting metadata from your data source.

Harvest Method

The Harvest Method specifies the means by which your metadata will be retrieved from the harvest point URI.

  • CKAN Harvester: (Custom method) harvester connects to a CKAN instance and downloads JSON in the format specified in the Provider Type.
  • CKANQUERY Harvester: (Custom method) harvester requests JSON metadata from the CKAN API using the provided URL. Note that you do not need to add the "start" and "rows" parameters to your URL as these are automatically set by the harvester.
  • CSW Harvester: (Custom method) harvester connects to a CSW instance and downloads XML in the format specified in the Provider Type.
  • GET Harvester: harvester connects directly to the Data Source URI using an HTTP GET and downloads XML in the format specified in the Provider Type.
  • JSONLD Harvester: (Custom method) a sitemap crawler and JSON-LD content extractor. Requires a URL pointing to a sitemap file, which can be text or XML (either <sitemapindex> or <urlset>). A default crosswalk from JSON-LD to RIF-CS will be run on import. This can be overridden by adding your own crosswalk to the harvest settings (see Add Crosswalk & Supporting File).
  • MAGDAQUERY Harvester: (Custom method) harvester requests JSON metadata from the MAGDA Search API using the provided URL. Note that you do not need to add the "start" and "limit" parameters to your URL as these are automatically set by the harvester.
  • OAI-PMH Harvester: (Custom method) harvester connects to your OAI-PMH feed using the OAI-PMH protocol and downloads records in the format specified in the Provider Type. Includes supporting services such as resumption tokens in case the harvest process fails during harvest, and error and exception reporting.
  • PURE Harvester: (Custom method) a simple dataset harvester using the Elsevier Pure API.

Please note that custom harvest methods require additional configuration (e.g. adding a crosswalk) to take place before they can be used. Additional harvest methods can also be supported. For more information please contact services@ardc.edu.au.
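
To show why the paging parameters can be left out of a CKANQUERY or MAGDAQUERY URL, the sketch below adds them to a query URL one page at a time. It is an assumption about how a paging harvester might work, not the ARDC harvester's code; the function name and the example host are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def paged_url(url: str, start: int, page_size: int, size_param: str = "rows") -> str:
    """Return url with paging parameters set, replacing any already present.

    size_param is "rows" for a CKAN query and "limit" for a MAGDA query.
    """
    parts = urlsplit(url)
    # Keep the provider's own query parameters, drop any stale paging values.
    query = [(name, value)
             for name, value in parse_qsl(parts.query, keep_blank_values=True)
             if name not in ("start", size_param)]
    query += [("start", str(start)), (size_param, str(page_size))]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Calling paged_url repeatedly with start increased by page_size each time walks through the full result set.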


URI

The location of the Data Source, from which documents are harvested by ARDC. Also known as the harvest point.

  • Any HTTP or HTTPS URL pointing to a RIF-CS XML feed for a DIRECT (HTTP) provider. Example: http://devl.ardc.org.au/test/aims.xml.
  • For an OAI-PMH provider, the base URL of the Data Provider. Example: http://mest.ivec.org/geonetwork/srv/oaipmh

Troubleshooting: Make sure you don't accidentally include leading or trailing spaces when you enter your harvest URI, as this is one of the causes of harvests failing.
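
If you prepare harvest URIs in a script or spreadsheet before pasting them in, a check along the following lines catches stray whitespace. This is a suggested sketch with a hypothetical function name, not a registry feature.

```python
from urllib.parse import urlsplit

def clean_harvest_uri(raw: str) -> str:
    """Strip surrounding whitespace and confirm the URI is HTTP or HTTPS."""
    uri = raw.strip()
    parts = urlsplit(uri)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an HTTP or HTTPS URL: {raw!r}")
    return uri
```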


Harvest Params

The Harvest Params fields are only displayed where a Harvest Method of 'CSW Harvester' is selected. The fields are displayed as Name-Value pairs, and allow you to customise the parameters passed in a CSW harvest request.

By default, the system provides the minimum set of CSW parameters required. You may change the values of these default parameters but should not delete them. If a default parameter is removed, the system will automatically reinsert it with its default value upon save. To add parameters to the request, click the 'Add Parameters' button displayed at the bottom of the table and enter the parameter name and value in the added fields.
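
The save behaviour described above amounts to merging your Name-Value pairs over a set of defaults. The sketch below illustrates this; the three default parameters shown are standard CSW request parameters, but whether they match the registry's actual default set is an assumption, as is the function name.

```python
# Assumed defaults for illustration; the registry's actual default set may differ.
DEFAULT_CSW_PARAMS = {"service": "CSW", "version": "2.0.2", "request": "GetRecords"}

def effective_params(user_params: dict) -> dict:
    """Apply the user's values over the defaults.

    A default that the user changed keeps the user's value; a default that
    the user deleted reappears with its default value.
    """
    return {**DEFAULT_CSW_PARAMS, **user_params}
```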


OAI Set

The OAI Set field is available with a Harvest Method of OAI-PMH. It allows you to instruct the ARDC harvester to retrieve records from a specific set in your OAI-PMH feed. This is especially useful when a new subset of records needs to be processed through quality assessment while other records are already in production.


Metadata Prefix

The Metadata Prefix drop down is available with a Harvest Method of OAI-PMH. The drop down allows you to tell the harvester what metadata prefix should be requested from your OAI-PMH service. ‘rif’ (i.e. RIF-CS) is the default value. Additional prefixes can be added to your data source by using the ‘Add Crosswalk’ button; however, if the format of the data is not RIF-CS, a crosswalk will need to be uploaded.


Output Schema

The Output Schema drop down is available with a Harvest Method of CSW. The drop down allows you to specify the ‘outputSchema’ parameter which is required by CSW service requests. The parameter specifies the return format for the CSW request and must be set to the URI of the output metadata schema (e.g. http://www.isotc211.org/2005/gmd).


Provider Type

The Provider Type drop down is available with Harvest Methods GET and CKAN. The Provider Type value describes the kind of metadata document you are making available for harvest and is used to identify a matching crosswalk in your Data Source. ‘rif’ (i.e. RIF-CS) is the default value and the type expected by the registry for ingest. Other provider types (e.g. Dublin Core, ISO19139, CSV, etc.) can be configured for your data source by adding a crosswalk from the type to RIF-CS.
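
Conceptually, the Provider Type acts as a lookup key for a crosswalk. The sketch below shows that idea only; the function name, the provider type strings other than 'rif' and the crosswalk file name are hypothetical.

```python
def select_crosswalk(provider_type: str, crosswalks: dict):
    """Return the crosswalk configured for provider_type, or None for RIF-CS."""
    if provider_type == "rif":
        return None  # already RIF-CS, so no transformation is needed
    try:
        return crosswalks[provider_type]
    except KeyError:
        raise LookupError(
            f"no crosswalk configured for provider type {provider_type!r}"
        ) from None
```

For example, with crosswalks = {"iso19139": "iso19139_to_rif.xsl"}, a provider type of "iso19139" selects that file, while "csv" raises an error until a matching crosswalk has been added.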


Add Crosswalk & Supporting File

The 'Add Crosswalk' button allows users to upload an XSLT crosswalk and configure it to be run during a harvest. Multiple crosswalks can be uploaded, but each needs to be uniquely identified by specifying a Crosswalk Provider Type, Output Format or Metadata Prefix. Once uploaded and identified, the crosswalk will be available for selection from the Provider Type, Output Format or Metadata Prefix drop down (depending on the harvest configuration). When selected, a green 'active' label will display next to the crosswalk.

The 'Add Supporting File' button can be used to upload a supporting file which can be referenced and used from within an XSLT crosswalk. Supporting files must be in XML or XSL format.


Advanced Harvest Mode

One of three harvest modes can be selected for ingesting records:

  • Standard Mode: Standard mode will ingest all records from the data source feed. Standard is the default mode.
  • Incremental Mode: The Incremental Mode will use OAI-PMH to support harvesting on a "from and until" basis. This means that only records that have been created or modified "from" the last harvest date "until" the date of the new harvest will be added to the record provider's existing dataset.



    The above example shows that after ingest, records 8 & 9 will be added to the data source, while the original record 4 will be replaced with the newly ingested record 4.

    Note: This mode will only function if the harvest method is set to "Harvested (OAI-PMH)", and if the OAI-PMH provider correctly supports the "from" and "until" parameters as per the OAI-PMH Specification.
  • Full Refresh Mode: The Full Refresh Mode will remove all previously harvested records and update the dataset with the latest ingested records. Records created manually using the Add Registry Object screens will not be removed when using this harvest mode.



    The above example shows that records 1, 2, 5 & 6 are removed, new records 8 & 9 are placed in the data source and record 4 is replaced. Record 3 is not removed, as it was manually created using the RDA's Add Registry Object screen.
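
The difference between the modes can also be worked through in a few lines of code. The sketch below is a simplified model, not the harvester's implementation: the function name, the dictionary layout and the "harvested"/"manual" labels are assumptions. Standard and Incremental modes merge incoming records in the same way here, because they differ in which records the feed delivers rather than in how those records are ingested.

```python
def apply_harvest(existing: dict, incoming: dict, full_refresh: bool = False) -> dict:
    """Return the data source contents after a harvest.

    existing: {key: (content, origin)}, origin is "harvested" or "manual".
    incoming: {key: content} delivered by this harvest.
    """
    if full_refresh:
        # Drop everything previously harvested; manually created records stay.
        result = {k: v for k, v in existing.items() if v[1] == "manual"}
    else:
        result = dict(existing)
    for key, content in incoming.items():
        result[key] = (content, "harvested")  # add new, replace changed
    return result

# Records 1-6 exist (record 3 was created manually); the harvest delivers 4, 8 and 9.
existing = {n: ("v1", "manual" if n == 3 else "harvested") for n in range(1, 7)}
incoming = {4: "v2", 8: "v1", 9: "v1"}

after_incremental = apply_harvest(existing, incoming)
after_refresh = apply_harvest(existing, incoming, full_refresh=True)
```

Here after_incremental holds records 1 to 6 plus 8 and 9, with record 4 replaced, while after_refresh holds only records 3, 4, 8 and 9, matching the two examples described above.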

Harvest Date

Use the 'Harvest Date' field to specify a future date and time when a harvest will occur or when regular recurring harvests will begin. The calendar widget allows nomination of a time zone.


Harvest Frequency

Use the 'Harvest Frequency' drop down to specify the frequency for your harvest.
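
Taken together, the Harvest Date and Harvest Frequency define a schedule: the first harvest runs at the Harvest Date and later harvests follow at the chosen interval. The sketch below shows one way to work out the next run. The frequency names and the function are hypothetical, and the options actually offered in the drop down may differ.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical frequency options; the drop down may offer different values.
INTERVALS = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "fortnightly": timedelta(weeks=2),
}

def next_harvest(first: datetime, frequency: str, now: datetime) -> datetime:
    """Return the first scheduled harvest at or after now."""
    if now <= first:
        return first
    step = INTERVALS[frequency]
    # Number of whole intervals needed to reach or pass now (ceiling division).
    intervals = -((first - now) // step)
    return first + intervals * step

aest = timezone(timedelta(hours=10))
first_run = datetime(2024, 1, 1, 2, 0, tzinfo=aest)
upcoming = next_harvest(first_run, "weekly", datetime(2024, 1, 10, 9, 0, tzinfo=aest))
```

Using time zone aware values, as the calendar widget does, avoids ambiguity about when the harvest will actually start.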


Support

If you are experiencing any issues with the page or have questions/comments, please email services@ardc.edu.au. A JIRA ticket will automatically be raised for your request and you will receive an email from the JIRA system with a link to the ticket. Opening the link to the ticket allows you to track and update the issue. You will also receive emails from the system whenever the ticket is updated.