Warning: These documents are under active development and subject to change (version 2.1.0-beta).
The latest release documents are at: https://purl.dataone.org/architecture

Member Node APIs

The service interfaces described here are exposed through the Member Node REST interface to support interactions with Coordinating Nodes and DataONE clients.

The following table provides a list of API methods exposed by Member Nodes.

Tier: The tier in which a method is grouped.
Version: The API version or versions in which the method is available. The lowest version number indicates when the method was added. A version number in parentheses indicates that the method is available in that version and is unchanged from the previous version. If more than one version number is present, the method signature or functionality has changed between API versions; e.g. “1.0, 2.0” indicates that the method was introduced in Version 1.0 and modified in Version 2.0.
REST: The HTTP method and path relative to the Base URL. Parameters specified in the URL are indicated by braces. Note that parameters included in a path MUST be properly path encoded, and parameters included as key, value pairs MUST also be properly encoded.
Function: The function name, associated with an API grouping.
Parameters: Indicates the parameters used when calling the method (sent in the message payload) and the return type.
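
The encoding requirement for path and query parameters can be illustrated with a short Python sketch. The helper name build_url is hypothetical and is not part of any DataONE library; the only assumption is that the Base URL, the API version, and the resource path are joined as in the curl examples in this document.

```python
from urllib.parse import quote, urlencode


def build_url(base_url, version, path_segments, query=None):
    """Join a Base URL, API version, and resource path into a request URL.

    Hypothetical helper for illustration only.
    """
    # safe="" forces "/" and ":" inside an identifier to be percent-encoded,
    # so an identifier containing slashes stays a single path segment.
    encoded = "/".join(quote(str(segment), safe="") for segment in path_segments)
    url = f"{base_url.rstrip('/')}/{version}/{encoded}"
    if query:
        # quote_via=quote encodes spaces as %20 rather than "+".
        url += "?" + urlencode(query, quote_via=quote)
    return url


NODE = "https://demo2.test.dataone.org/knb/d1/mn"
print(build_url(NODE, "v1", ["object", "hdl:10255/dryad.218/mets.xml"]))
print(build_url(NODE, "v1", ["log"], {"start": 0, "count": 3}))
```

Encoding each path segment separately is a design choice of this sketch: it keeps the separators between segments literal while escaping any reserved characters that occur inside an identifier.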
Methods for MN component
Tier Version REST Function Parameters
Tier 1 1.0 GET /monitor/ping MNCore.ping() () -> null
Tier 1 1.0, 2.0 GET /log?[fromDate={fromDate}][&toDate={toDate}][&event={event}][&idFilter={idFilter}][&start={start}][&count={count}] MNCore.getLogRecords() (session, [fromDate], [toDate], [event], [idFilter], [start=0], [count=1000]) -> Types.Log
Tier 1 1.0 GET /  and  GET /node MNCore.getCapabilities() () -> Types.Node
Tier 1 1.0 GET /object/{id} MNRead.get() (session, id) -> Types.OctetStream
Tier 1 1.0 GET /meta/{id} MNRead.getSystemMetadata() (session, id) -> Types.SystemMetadata
Tier 1 1.0 HEAD /object/{id} MNRead.describe() (session, id) -> Types.DescribeResponse
Tier 1 1.0 GET /checksum/{pid}[?checksumAlgorithm={checksumAlgorithm}] MNRead.getChecksum() (session, pid, [checksumAlgorithm]) -> Types.Checksum
Tier 1 1.0 GET /object[?fromDate={fromDate}&toDate={toDate}&identifier={identifier}&formatId={formatId}&replicaStatus={replicaStatus}&start={start}&count={count}] MNRead.listObjects() (session, [fromDate], [toDate], [formatId], [identifier], [replicaStatus], [start=0], [count=1000]) -> Types.ObjectList
Tier 1   POST /error MNRead.synchronizationFailed() (session, message) -> Types.Boolean
Tier 1 1.0 POST /dirtySystemMetadata MNRead.systemMetadataChanged() (session, id, serialVersion, dateSysMetaLastModified) -> boolean
Tier 1 1.0 GET /replica/{pid} MNRead.getReplica() (session, pid) -> Types.OctetStream
Tier 2 1.0 GET /isAuthorized/{id}?action={action} MNAuthorization.isAuthorized() (session, id, action) -> boolean
Tier 3 1.0 POST /object MNStorage.create() (session, pid, object, sysmeta) -> Types.Identifier
Tier 3 1.0 PUT /object/{pid} MNStorage.update() (session, pid, object, newPid, sysmeta) -> Types.Identifier
Tier 3 1.0 POST /generate MNStorage.generateIdentifier() (session, scheme, [fragment]) -> Types.Identifier
Tier 3 1.0 DELETE /object/{id} MNStorage.delete() (session, id) -> Types.Identifier
Tier 3 1.0 PUT /archive/{id} MNStorage.archive() (session, id) -> Types.Identifier
Tier 1 2.0 PUT /meta MNStorage.updateSystemMetadata() (session, pid, sysmeta) -> boolean
Tier 4 1.0 POST /replicate MNReplication.replicate() (session, sysmeta, sourceNode) -> boolean
Tier 1 1.1 GET /query/{queryEngine}/{query} MNQuery.query() (session, queryEngine, query) -> Types.OctetStream
Tier 1 1.1 GET /query/{queryType} MNQuery.getQueryEngineDescription() (session, queryEngine) -> Types.QueryEngineDescription
Tier 1 1.1 GET /query MNQuery.listQueryEngines() (session) -> Types.QueryEngineList
Tier 1 1.2 GET /views/{theme}/{pid} MNView.view() (session, theme, id) -> Types.OctetStream
Tier 1 1.2 GET /views MNView.listViews() (session) -> Types.OptionList
Tier 1 1.2 GET /packages/{packageType}/{pid} MNPackage.getPackage() (session, packageType, id) -> Types.OctetStream

Core API

The MNCore API provides mechanisms for a Member Node to report on the level of service compliance and to specify replication policies. The capabilities information is used in the Member Node registration process by the Coordinating Nodes.

The state of health API provides mechanisms for the monitoring infrastructure to report on the current state of the DataONE infrastructure and for the Coordinating Nodes to track the current operating state of the Member Node.

Functions defined in MNCore
Tier Version REST Function Parameters
Tier 1 1.0 GET /monitor/ping ping() () -> null
Tier 1 1.0, 2.0 GET /log?[fromDate={fromDate}][&toDate={toDate}][&event={event}][&idFilter={idFilter}][&start={start}][&count={count}] getLogRecords() (session, [fromDate], [toDate], [event], [idFilter], [start=0], [count=1000]) -> Types.Log
Tier 1 1.0 GET /  and  GET /node getCapabilities() () -> Types.Node
MNCore.ping() → null

Low level “are you alive” operation. A valid ping response is indicated by an HTTP status of 200. A timestamp indicating the current system time (UTC) on the node MUST be returned in the HTTP Date header.

The Member Node should perform some minimal internal functionality testing before answering. However, ping checks will be frequent (every few minutes) so the internal functionality test should not be high impact.

Any status response other than 200 indicates that the node is offline for DataONE operations.

Note that the timestamp returned in the Date header should follow the semantics as described in the HTTP specifications, http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.18

The response body will be ignored by the caller except in the case of an error, in which case the response body should contain the appropriate DataONE exception.

Version:

1.0

Use Cases:

UC10

REST URL:

GET /monitor/ping

Returns:

Null body or Exception. The body of the message may be ignored by the caller. The HTTP header Date MUST be set in the response.

Return type:

null

Raises:
  • Exceptions.NotImplemented

    (errorCode=501, detailCode=2041)

    Ping is a required operation and so an operational member node should never return this exception unless under development.

  • Exceptions.ServiceFailure

    (errorCode=500, detailCode=2042)

    A ServiceFailure exception indicates that the node is not currently operational as a member node. A coordinating node or monitoring service may use this as an indication that the member node should be taken out of the pool of active nodes, though ping should be called on a regular basis to determine when the node might be ready to resume normal operations.

  • Exceptions.InsufficientResources

    (errorCode=413, detailCode=2045)

    A ping response may return InsufficientResources if for example the system is in a state where normal DataONE operations may be impeded by an unusually high load on the node.

Response

The response should be a valid HTTP response with a blank or arbitrary body. Only the HTTP header information is considered by the requestor. A successful response MUST have an HTTP status code of 200. In case of an error condition, the appropriate HTTP status code MUST be set, and an exception or error information MAY be returned in the response body.
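
A caller's handling of the ping response might be sketched as follows in Python. The function name check_ping and the choice of exception classes are assumptions made for illustration; the status code and headers are assumed to have been obtained already from whatever HTTP client is in use, with header names in their canonical capitalization.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def check_ping(status, headers):
    """Return the node's reported time in UTC, or raise if the ping is invalid.

    Hypothetical helper: `status` is the HTTP status code and `headers` is a
    mapping of response header names to values.
    """
    if status != 200:
        # Any status other than 200 means the node is offline for DataONE use.
        raise RuntimeError(f"node is offline for DataONE operations (HTTP {status})")
    date_header = headers.get("Date")
    if date_header is None:
        raise ValueError("ping response is missing the required Date header")
    # The Date header uses the HTTP date format, e.g. "Tue, 06 Mar 2012 14:19:59 GMT".
    return parsedate_to_datetime(date_header).astimezone(timezone.utc)


node_time = check_ping(200, {"Date": "Tue, 06 Mar 2012 14:19:59 GMT"})
print(node_time.isoformat())
```

The response body is deliberately not inspected on success, since only the status line and the Date header carry meaning for a valid ping.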

Example

Example of a ping request and response for a Member Node (Coordinating Nodes implement the same functionality). Lines prefixed with “>” indicate outgoing information; lines prefixed with “<” show content returned from the server. Lines associated with SSL connection initiation and close are not shown here. Note that the actual response headers may vary; the only required header fields are the first status line and a Date entry. However, in order to fully support clients that may cache the response, it is recommended that the Expires and Cache-Control headers are returned.

export NODE="https://demo2.test.dataone.org/knb/d1/mn"
curl -k -v "$NODE/v1/monitor/ping"

> GET /knb/d1/mn/v1/monitor/ping HTTP/1.1
> User-Agent: curl/7.21.6 (x86_64-pc-linux-gnu) libcurl/7.21.6
  OpenSSL/1.0.0e zlib/1.2.3.4 libidn/1.22 librtmp/2.3
> Host: demo2.test.dataone.org
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Tue, 06 Mar 2012 14:19:59 GMT
< Server: Apache/2.2.14 (Ubuntu)
< Content-Length: 0
< Content-Type: text/plain
<
MNCore.getLogRecords(session[, fromDate][, toDate][, event][, idFilter][, start=0][, count=1000]) → Log

Retrieve log information from the Member Node for the specified slice parameters. Log entries will only return PIDs.

This method is used primarily by the log aggregator to generate aggregate statistics for nodes, objects, and the methods of access.

The response MUST contain only records for which the requestor has permission to read.

Note that date time precision is limited to one millisecond. If no timezone information is provided UTC will be assumed.
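
The precision and timezone rules can be illustrated with a small Python sketch that prepares a value for fromDate or toDate. The helper name to_d1_datetime is hypothetical; the output layout follows the dateLogged values shown in the example response for this method.

```python
from datetime import datetime, timezone


def to_d1_datetime(value):
    """Format a datetime with millisecond precision, assuming UTC if naive.

    Hypothetical helper for illustration only.
    """
    if value.tzinfo is None:
        # No timezone information supplied, so UTC is assumed.
        value = value.replace(tzinfo=timezone.utc)
    # timespec="milliseconds" truncates (does not round) the microseconds.
    return value.isoformat(timespec="milliseconds")


print(to_d1_datetime(datetime(2012, 2, 29, 23, 25, 58, 104999)))
```

When such a value is sent as a URL query parameter, the “+” of the UTC offset needs to be percent-encoded as %2B like any other reserved character.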

Access control for this method MUST be configured to allow calling by Coordinating Nodes and MAY be configured to allow more general access.

v2.0: The event parameter has changed from v1_0.Types.Event to a plain string

v2.0: The structure of v2_0.Types.Log has changed.

Version:

1.0, 2.0

REST URL:

GET /log?[fromDate={fromDate}][&toDate={toDate}][&event={event}][&idFilter={idFilter}][&start={start}][&count={count}]

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • fromDate (Types.DateTime) – Records with time stamp greater than or equal to (>=) this value will be returned. Transmitted as a URL query parameter, and so must be escaped accordingly.
  • toDate (Types.DateTime) – Records with a time stamp less than (<) this value will be returned. If not specified, then defaults to now. Transmitted as a URL query parameter, and so must be escaped accordingly.
  • event (Types.Event, string) – Return only log records for the specified type of event. Default is all. Transmitted as a URL query parameter, and so must be escaped accordingly.
  • idFilter (string) – Return only log records for identifiers that start with the supplied identifier string. Support for this parameter is optional and MAY be ignored by the Member Node implementation with no warning. Accepts PIDs and SIDs. Transmitted as a URL query parameter, and so must be escaped accordingly.
  • start=0 (integer) – Optional zero based offset from the first record in the set of matching log records. Used to assist with paging the response. Transmitted as a URL query parameter, and so must be escaped accordingly.
  • count=1000 (integer) – The maximum number of log records that should be returned in the response. The Member Node may return fewer and the caller should check the total in the response to determine if further pages may be retrieved. Transmitted as a URL query parameter, and so must be escaped accordingly.
Returns:

Return type:

Types.Log

Raises:

Example

Example of retrieving 3 log records from a Member Node. The xml command is provided by xmlstarlet and is used to format the output.

export NODE="https://demo2.test.dataone.org/knb/d1/mn"
curl -k -s "$NODE/v1/log?start=0&count=3" | xml fo

<?xml version="1.0" encoding="UTF-8"?>
<d1:log xmlns:d1="http://ns.dataone.org/service/types/v1" count="3" start="0" total="1273">
  <logEntry>
    <entryId>1</entryId>
    <identifier>MNodeTierTests.201260152556757.</identifier>
    <ipAddress>129.24.0.17</ipAddress>
    <userAgent>null</userAgent>
    <subject>CN=testSubmitter,DC=dataone,DC=org</subject>
    <event>create</event>
    <dateLogged>2012-02-29T23:25:58.104+00:00</dateLogged>
    <nodeIdentifier>urn:node:DEMO2</nodeIdentifier>
  </logEntry>
  <logEntry>
    <entryId>2</entryId>
    <identifier>TierTesting:testObject:RightsHolder_Person.4</identifier>
    <ipAddress>129.24.0.17</ipAddress>
    <userAgent>null</userAgent>
    <subject>CN=testSubmitter,DC=dataone,DC=org</subject>
    <event>create</event>
    <dateLogged>2012-02-29T23:26:38.828+00:00</dateLogged>
    <nodeIdentifier>urn:node:DEMO2</nodeIdentifier>
  </logEntry>
  <logEntry>
    <entryId>3</entryId>
    <identifier>TierTesting:testObject:RightsHolder_Group.4</identifier>
    <ipAddress>129.24.0.17</ipAddress>
    <userAgent>null</userAgent>
    <subject>CN=testSubmitter,DC=dataone,DC=org</subject>
    <event>create</event>
    <dateLogged>2012-02-29T23:27:40.255+00:00</dateLogged>
    <nodeIdentifier>urn:node:DEMO2</nodeIdentifier>
  </logEntry>
</d1:log>
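
A client that pages through log records needs to compare the start, count, and total attributes of each response. The following Python sketch shows one way to do that for a v1 response shaped like the example above. The function name parse_log_page is hypothetical, and the sketch assumes the child elements are unqualified, as they appear in the example.

```python
import xml.etree.ElementTree as ET

D1_NS = "http://ns.dataone.org/service/types/v1"


def parse_log_page(xml_text):
    """Return (entries, next_start) for one page of a v1 Log document.

    Hypothetical helper: next_start is None when no further pages remain.
    """
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{D1_NS}}}log":
        raise ValueError(f"unexpected root element: {root.tag}")
    start = int(root.get("start"))
    count = int(root.get("count"))
    total = int(root.get("total"))
    # Each logEntry becomes a plain dict keyed by child element name.
    entries = [
        {child.tag: child.text for child in entry}
        for entry in root.findall("logEntry")
    ]
    # The node may return fewer records than requested, so the offset of the
    # next page is derived from the count actually reported in the response.
    next_start = start + count if start + count < total else None
    return entries, next_start
```

With the example response above (start="0", count="3", total="1273"), the next request would use start=3.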
MNCore.getCapabilities() → Node

Returns a document describing the capabilities of the Member Node.

The response at the Member Node base URL is for convenience only. Clients of Member Nodes SHOULD use the /node URL to retrieve the node capabilities document.

Version:

1.0

REST URL:

GET /  and  GET /node

Returns:

The technical capabilities of the Member Node

Return type:

Types.Node

Raises:

Example

export NODE="https://demo2.test.dataone.org/knb/d1/mn"
curl -k -s "$NODE/v1/node" | xml fo

<?xml version="1.0" encoding="UTF-8"?>
<d1:node xmlns:d1="http://ns.dataone.org/service/types/v1" replicate="true" synchronize="true" type="mn" state="up">
  <identifier>urn:node:DEMO2</identifier>
  <name>DEMO2 Metacat Node</name>
  <description>A DataONE member node implemented in Metacat.</description>
  <baseURL>https://demo2.test.dataone.org:443/knb/d1/mn</baseURL>
  <services>
    <service name="MNRead" version="v1" available="true"/>
    <service name="MNCore" version="v1" available="true"/>
    <service name="MNAuthorization" version="v1" available="true"/>
    <service name="MNStorage" version="v1" available="true"/>
    <service name="MNReplication" version="v1" available="true"/>
  </services>
  <synchronization>
    <schedule hour="*" mday="*" min="0/3" mon="*" sec="10" wday="?" year="*"/>
    <lastHarvested>2012-03-06T14:57:39.851+00:00</lastHarvested>
    <lastCompleteHarvest>2012-03-06T14:57:39.851+00:00</lastCompleteHarvest>
  </synchronization>
  <ping success="true"/>
  <subject>CN=urn:node:DEMO2, DC=dataone, DC=org</subject>
  <contactSubject>CN=METACAT1, DC=dataone, DC=org</contactSubject>
</d1:node>
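
A client can use the capabilities document to decide which services to call. The Python sketch below extracts the services marked as available from a response shaped like the example above. The function name available_services is hypothetical, and the element layout is assumed to match the example.

```python
import xml.etree.ElementTree as ET


def available_services(node_xml):
    """Map each advertised service name to the set of versions marked available.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(node_xml)
    services = {}
    for svc in root.findall("./services/service"):
        # Only services explicitly flagged available="true" are reported.
        if svc.get("available") == "true":
            services.setdefault(svc.get("name"), set()).add(svc.get("version"))
    return services
```

For the example document, the result would contain MNRead, MNCore, MNAuthorization, MNStorage, and MNReplication, each with version v1.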

Read API

The MNRead API implements methods that enable object management operations on a Member Node.

Functions defined in MNRead
Tier Version REST Function Parameters
Tier 1 1.0 GET /object/{id} get() (session, id) -> Types.OctetStream
Tier 1 1.0 GET /meta/{id} getSystemMetadata() (session, id) -> Types.SystemMetadata
Tier 1 1.0 HEAD /object/{id} describe() (session, id) -> Types.DescribeResponse
Tier 1 1.0 GET /checksum/{pid}[?checksumAlgorithm={checksumAlgorithm}] getChecksum() (session, pid, [checksumAlgorithm]) -> Types.Checksum
Tier 1 1.0 GET /object[?fromDate={fromDate}&toDate={toDate}&identifier={identifier}&formatId={formatId}&replicaStatus={replicaStatus}&start={start}&count={count}] listObjects() (session, [fromDate], [toDate], [formatId], [identifier], [replicaStatus], [start=0], [count=1000]) -> Types.ObjectList
Tier 1   POST /error synchronizationFailed() (session, message) -> Types.Boolean
Tier 1 1.0 POST /dirtySystemMetadata systemMetadataChanged() (session, id, serialVersion, dateSysMetaLastModified) -> boolean
Tier 1 1.0 GET /replica/{pid} getReplica() (session, pid) -> Types.OctetStream
MNRead.get(session, id) → OctetStream

Retrieve an object identified by id from the node. Supports both PIDs and SIDs. When called with a SID, the object identified by the head PID of the series is returned.

The response MUST contain the bytes of the indicated object, and the checksum of the bytes retrieved SHOULD match the SystemMetadata.checksum recorded in the Types.SystemMetadata when calling with PID.

If the object does not exist on the node servicing the request, then Exceptions.NotFound must be raised even if the object exists on another node in the DataONE system.

Also implemented by Coordinating Nodes as CNRead.get().

Version:

1.0

Use Cases:

UC01, UC06, UC16

REST URL:

GET /object/{id}

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • id (Types.Identifier) – The identifier for the object to be retrieved. May be a PID or a SID. Transmitted as part of the URL path and must be escaped accordingly.
Returns:

Bytes of the specified object.

Return type:

Types.OctetStream

Raises:

Examples

(GET) Retrieve the object with identifier “XYZ332”:

export NODE="https://demo2.test.dataone.org/knb/d1/mn"
curl -k "$NODE/v1/object/XYZ332"

... data ...

(GET) Attempt to retrieve a non-existent object (and show headers in response):

export NODE="https://demo2.test.dataone.org/knb/d1/mn"
curl -D - "$NODE/v1/object/DOESNTEXIST"

HTTP/1.1 404 Not Found
Date: Tue, 06 Mar 2012 15:25:35 GMT
Server: Apache/2.2.14 (Ubuntu)
Content-Length: 196
Vary: Accept-Encoding
Content-Type: text/xml

<?xml version="1.0" encoding="UTF-8"?>
<error detailCode="1800" errorCode="404" name="NotFound">
   <description>No system metadata could be found for given PID: DOESNTEXIST</description>
</error>
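
Error bodies such as the one above can be turned into a client-side exception. The Python sketch below is one possible approach; the class name DataONEError and the function parse_error are hypothetical and do not come from a DataONE client library. The detailCode is kept as text because this sketch does not assume it is always a plain integer.

```python
import xml.etree.ElementTree as ET


class DataONEError(Exception):
    """Hypothetical client-side representation of a DataONE error document."""

    def __init__(self, name, error_code, detail_code, description):
        super().__init__(f"{name} ({error_code}/{detail_code}): {description}")
        self.name = name
        self.error_code = error_code
        self.detail_code = detail_code
        self.description = description


def parse_error(body):
    """Build a DataONEError from the bytes of an <error> response body."""
    root = ET.fromstring(body)
    if root.tag != "error":
        raise ValueError(f"not a DataONE error document: {root.tag}")
    return DataONEError(
        name=root.get("name"),
        error_code=int(root.get("errorCode")),
        detail_code=root.get("detailCode"),
        description=(root.findtext("description") or "").strip(),
    )
```

A caller would typically raise the returned exception when the HTTP status of the response is not 200.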
MNRead.getSystemMetadata(session, id) → SystemMetadata

Describes the object identified by id by returning the associated system metadata object.

If the object does not exist on the node servicing the request, then Exceptions.NotFound MUST be raised even if the object exists on another node in the DataONE system.

Version:

1.0

Use Cases:

UC06, UC37, UC16

REST URL:

GET /meta/{id}

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • id (Types.Identifier) – Identifier for the science data or science metadata object of interest. May be either a PID or a SID. Transmitted as part of the URL path and must be escaped accordingly.
Returns:

System metadata object describing the object.

Return type:

Types.SystemMetadata

Raises:

Examples

(GET) Retrieve system metadata from a Member Node for object “XYZ332” which happens to be science metadata (an EML document) that has been obsoleted by a new version with identifier “XYZ333”:

curl http://m1.dataone.org/mn/v1/meta/XYZ332

<?xml version="1.0" encoding="UTF-8"?>
<d1:systemMetadata xmlns:d1="http://ns.dataone.org/service/types/v1">
  <serialVersion>1</serialVersion>
  <identifier>XYZ332</identifier>
  <formatId>eml://ecoinformatics.org/eml-2.1.0</formatId>
  <size>20875</size>
  <checksum algorithm="MD5">e7451c1775461b13987d7539319ee41f</checksum>
  <submitter>uid=mbauer,o=NCEAS,dc=ecoinformatics,dc=org</submitter>
  <rightsHolder>uid=mbauer,o=NCEAS,dc=ecoinformatics,dc=org</rightsHolder>
  <accessPolicy>
    <allow>
      <subject>uid=jdoe,o=NCEAS,dc=ecoinformatics,dc=org</subject>
      <permission>read</permission>
      <permission>write</permission>
      <permission>changePermission</permission>
    </allow>
    <allow>
      <subject>public</subject>
      <permission>read</permission>
    </allow>
    <allow>
      <subject>uid=nceasadmin,o=NCEAS,dc=ecoinformatics,dc=org</subject>
      <permission>read</permission>
      <permission>write</permission>
      <permission>changePermission</permission>
    </allow>
  </accessPolicy>
  <replicationPolicy replicationAllowed="false"/>
  <obsoletes>XYZ331</obsoletes>
  <obsoletedBy>XYZ333</obsoletedBy>
  <archived>true</archived>
  <dateUploaded>2008-04-01T23:00:00.000+00:00</dateUploaded>
  <dateSysMetadataModified>2012-06-26T03:51:25.058+00:00</dateSysMetadataModified>
  <originMemberNode>urn:node:TEST</originMemberNode>
  <authoritativeMemberNode>urn:node:TEST</authoritativeMemberNode>
</d1:systemMetadata>

(GET) Attempt to retrieve system metadata for an object that does not exist:

curl http://cn.dataone.org/cn/v1/meta/SomeObjectID

<?xml version="1.0" encoding="UTF-8"?>
<error detailCode="1800" errorCode="404" name="NotFound">
  <description>No system metadata could be found for given PID: SomeObjectID</description>
</error>
MNRead.describe(session, id) → DescribeResponse

This method provides a lighter weight mechanism than MNRead.getSystemMetadata() for a client to determine basic properties of the referenced object. The response should indicate properties that are typically returned in an HTTP HEAD request: the date last modified, the size of the object, and the type of the object (the SystemMetadata.formatId).

The principal indicated by session must have read privileges on the object, otherwise Exceptions.NotAuthorized is raised.

If the object does not exist on the node servicing the request, then Exceptions.NotFound must be raised even if the object exists on another node in the DataONE system.

Note that this method is likely to be called frequently and so efficiency should be taken into consideration during implementation.

Version:

1.0

Use Cases:

UC16

REST URL:

HEAD /object/{id}

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • id (Types.Identifier) – Identifier for the object in question. May be either a PID or a SID. Transmitted as part of the URL path and must be escaped accordingly.
Returns:

A set of values providing a basic description of the object.

Return type:

Types.DescribeResponse

Raises:

Examples

(HEAD) Retrieve information about the object with identifier “ABC123”:

curl -I http://mn1.dataone.org/mn/v1/object/ABC123

HTTP/1.1 200 OK
Last-Modified: Wed, 16 Dec 2009 13:58:34 GMT
Content-Length: 10400
Content-Type: application/octet-stream
DataONE-ObjectFormat: eml://ecoinformatics.org/eml-2.0.1
DataONE-Checksum: SHA-1,2e01e17467891f7c933dbaa00e1459d23db3fe4f
DataONE-SerialVersion: 1234

(HEAD) An error response to a describe() request for object “IDONTEXIST”:

curl -I http://mn1.dataone.org/mn/v1/object/IDONTEXIST

HTTP/1.1 404 Not Found
Last-Modified: Wed, 16 Dec 2009 13:58:34 GMT
Content-Length: 1182
Content-Type: text/xml
DataONE-Exception-Name: NotFound
DataONE-Exception-DetailCode: 1380
DataONE-Exception-Description: The specified object does not exist on this node.
DataONE-Exception-PID: IDONTEXIST
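
The headers of a successful describe() response can be collected into a simple structure. The following Python sketch assumes the header names and the “algorithm,value” layout of DataONE-Checksum exactly as shown in the example above; the function name parse_describe is hypothetical, and the headers are assumed to be supplied as a mapping that uses the capitalization shown.

```python
from email.utils import parsedate_to_datetime


def parse_describe(headers):
    """Collect the describe() response headers into a dict.

    Hypothetical helper for illustration only.
    """
    # DataONE-Checksum carries the algorithm name and the digest, comma separated.
    algorithm, _, value = headers["DataONE-Checksum"].partition(",")
    return {
        "format_id": headers["DataONE-ObjectFormat"],
        "size": int(headers["Content-Length"]),
        "last_modified": parsedate_to_datetime(headers["Last-Modified"]),
        "checksum_algorithm": algorithm.strip(),
        "checksum": value.strip(),
        "serial_version": int(headers["DataONE-SerialVersion"]),
    }
```

Since HTTP header names are case-insensitive, a real client would normally look them up through its HTTP library's case-insensitive header mapping rather than a plain dict.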
MNRead.getChecksum(session, pid[, checksumAlgorithm]) → Checksum

Returns a Types.Checksum for the specified object using an accepted hashing algorithm. The result is used to determine if two instances referenced by a PID are identical, hence it is necessary that MNs can ensure that the returned checksum is valid for the referenced object either by computing it on the fly or by using a cached value that is certain to be correct.
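
Computing the checksum on the fly might look like the following Python sketch. The function name compute_checksum and the ALGORITHMS mapping are assumptions for illustration; only the two algorithm names that appear in the examples in this document (MD5 and SHA-1) are mapped, and the controlled list of accepted names is not reproduced here.

```python
import hashlib
import io

# Hypothetical mapping from algorithm names seen in this document's examples
# to the names understood by hashlib.
ALGORITHMS = {"MD5": "md5", "SHA-1": "sha1"}


def compute_checksum(stream, algorithm, chunk_size=65536):
    """Return the hex digest of the bytes read from a binary stream."""
    digest = hashlib.new(ALGORITHMS[algorithm])
    # Read in chunks so that large objects are never held fully in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


print(compute_checksum(io.BytesIO(b""), "MD5"))
```

A node that serves cached checksum values instead would still need some way to invalidate the cache whenever the stored bytes change.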

Version:

1.0

REST URL:

GET /checksum/{pid}[?checksumAlgorithm={checksumAlgorithm}]

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • pid (Types.Identifier) – The identifier of the object the operation is being performed on. Transmitted as part of the URL path and must be escaped accordingly.
  • checksumAlgorithm (string) – The name of an algorithm that will be used to compute a checksum of the bytes of the object. This value is drawn from a DataONE controlled list of values as indicated in the Types.SystemMetadata. If not specified, then the system wide default checksum algorithm should be used. Transmitted as a URL query parameter, and so must be escaped accordingly.
Returns:

The checksum value originally computed for the specified object.

Return type:

Types.Checksum

Raises:
MNRead.listObjects(session[, fromDate][, toDate][, formatId][, identifier][, replicaStatus][, start=0][, count=1000]) → ObjectList

Retrieve the list of objects present on the MN that match the calling parameters. This method is required to support the process of Member Node synchronization. At a minimum, this method MUST be able to return a list of objects that match:

fromDate <= SystemMetadata.dateSysMetadataModified

but is expected to also support date range (by also specifying toDate), and should also support slicing of the matching set of records by indicating the starting index of the response (where 0 is the index of the first item) and the count of elements to be returned.

Note that date time precision is limited to one millisecond. If no timezone information is provided, UTC will be assumed.

Version:

1.0

Use Cases:

UC06, UC16

REST URL:

GET /object[?fromDate={fromDate}&toDate={toDate}&identifier={identifier}&formatId={formatId}&replicaStatus={replicaStatus}&start={start}&count={count}]

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • fromDate (Types.DateTime) – Entries with SystemMetadata.dateSysMetadataModified greater than or equal to (>=) fromDate must be returned. Transmitted as a URL query parameter, and so must be escaped accordingly.
  • toDate (Types.DateTime) – Entries with SystemMetadata.dateSysMetadataModified less than (<) toDate must be returned. Transmitted as a URL query parameter, and so must be escaped accordingly.
  • formatId (Types.ObjectFormatIdentifier) – Restrict results to the specified object format identifier. Transmitted as a URL query parameter, and so must be escaped accordingly.
  • identifier (Types.Identifier) – Restrict results to the specified identifier. May be a PID or a SID. In the case of the latter, returns a listing of all PIDs that share the given SID. Transmitted as a URL query parameter, and so must be escaped accordingly.
  • replicaStatus (boolean) – Indicates if replicated objects should be returned in the list (i.e. any entries present in the SystemMetadata.replica, objects that have been replicated to this node). If false, then no objects that have been replicated should be returned. If true, then any objects can be returned, regardless of replication status. The default value is true. Transmitted as a URL query parameter, and so must be escaped accordingly.
  • start=0 (integer) – The zero-based index of the first value, relative to the first record of the resultset that matches the parameters. Transmitted as a URL query parameter, and so must be escaped accordingly.
  • count=1000 (integer) – The maximum number of entries that should be returned in the response. The Member Node may return fewer and the caller should check the total in the response to determine if further pages may be retrieved. Transmitted as a URL query parameter, and so must be escaped accordingly.
Returns:

The list of PIDs that match the query criteria. If none match, an empty list is returned.

Return type:

Types.ObjectList

Raises:

Example

Retrieve an object list from a member node, and pipe the response through an xml formatter for easier viewing:

curl "https://gmn-dev.test.dataone.org/mn/v1/object?count=5" | xml fo

<?xml version="1.0"?>
<ns1:objectList xmlns:ns1="http://ns.dataone.org/service/types/v1" count="5" start="0" total="12">
  <objectInfo>
    <identifier>AnserMatrix.htm</identifier>
    <formatId>eml://ecoinformatics.org/eml-2.0.0</formatId>
    <checksum algorithm="MD5">0e25cf59d7bd4d57154cc83e0aa32b34</checksum>
    <dateSysMetadataModified>1970-05-27T06:12:49</dateSysMetadataModified>
    <size>11048</size>
  </objectInfo>

  ...

  <objectInfo>
    <identifier>hdl:10255/dryad.218/mets.xml</identifier>
    <formatId>eml://ecoinformatics.org/eml-2.0.0</formatId>
    <checksum algorithm="MD5">65c4e0a9c4ccf37c1e3ecaaa2541e9d5</checksum>
    <dateSysMetadataModified>1987-01-14T07:09:09</dateSysMetadataModified>
    <size>2796</size>
  </objectInfo>
</ns1:objectList>
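
Slicing with start and count can be sketched in Python as a generator that walks successive pages until the reported total is reached. The name iter_object_ids is hypothetical, and fetch_page(start, count) stands for any caller-supplied function that performs the listObjects request and returns the response XML; no particular HTTP client is assumed.

```python
import xml.etree.ElementTree as ET


def iter_object_ids(fetch_page, page_size=1000):
    """Yield identifiers from successive listObjects pages.

    Hypothetical helper: fetch_page(start, count) must return the XML text
    of one ObjectList response.
    """
    start = 0
    while True:
        root = ET.fromstring(fetch_page(start, page_size))
        infos = root.findall("objectInfo")
        for info in infos:
            yield info.findtext("identifier")
        # Advance by the number of entries actually received, because the
        # node may return fewer than the requested count.
        start += len(infos)
        if not infos or start >= int(root.get("total")):
            break
```

Stopping on an empty page as well as on the total guards against looping forever if the total changes while the list is being read.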
MNRead.synchronizationFailed(session, message) → Boolean

This is a callback method used by a CN to indicate to an MN that it cannot complete synchronization of the science metadata identified by pid. When called, the MN should take steps to record the problem description and notify an administrator or the data owner of the issue.

A successful response is indicated by an HTTP status of 200. An unsuccessful call is indicated by a returned exception and associated HTTP status code.

Access control for this method MUST be configured to allow calling by Coordinating Nodes and MAY be configured to allow more general access.

Version:
Use Cases:

UC06

REST URL:

POST /error

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • message (Types.Exception) – An instance of the Exceptions.SynchronizationFailed exception with body appropriately filled. Transmitted as a UTF-8 encoded XML structure for the respective type as defined in the DataONE types schema, as a File part of the MIME multipart/mixed message.
Returns:

A successful response is indicated by an HTTP 200 status. An unsuccessful call is indicated by returning the appropriate exception.

Return type:

Types.Boolean

Raises:
MNRead.systemMetadataChanged(session, id, serialVersion, dateSysMetaLastModified) → boolean

Notifies the Member Node that the authoritative copy of system metadata on the Coordinating Nodes has changed.

The Member Node SHOULD schedule an update to its information about the affected object by retrieving an authoritative copy from a Coordinating Node.

Note that date time precision is limited to one millisecond.

Access control for this method MUST be configured to allow calling by Coordinating Nodes.

Version:

1.0

REST URL:

POST /dirtySystemMetadata

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • id (Types.Identifier) – Identifier of the object for which system metadata was changed. May be either a PID or a SID. Calling with SID is equivalent to calling with HEAD PID. Transmitted as a UTF-8 String as a Param part of the MIME multipart/mixed message.
  • serialVersion (unsigned long) – The serialVersion of the system metadata. Transmitted as a UTF-8 String as a Param part of the MIME multipart/mixed message.
  • dateSysMetaLastModified (Types.DateTime) – The time stamp for when the system metadata was changed. Transmitted as a UTF-8 String as a Param part of the MIME multipart/mixed message.
Returns:

True if notification was received OK, otherwise an error is returned.

Return type:

boolean

Raises:
MNRead.getReplica(session, pid) → OctetStream

Called by a target Member Node to fulfill the replication request originated by a Coordinating Node calling MNReplication.replicate(). This is a request to make a replica copy of the object, and differs from a call to GET /object in that it should be logged as a replication event rather than a read event on that object.

If the object being retrieved is restricted access, then a Tier 2 or higher Member Node MUST make a call to CNReplication.isNodeAuthorized() to verify that the Subject of the caller is authorized to retrieve the content.

A successful operation is indicated by a HTTP status of 200 on the response.

Failure of the operation MUST be indicated by returning an appropriate exception.
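The behaviour described above might be sketched on the source Member Node roughly as follows. Everything here is illustrative: serve_replica, the in-memory store, the public_pids set, and the event name "replicate" are assumptions, and is_node_authorized stands in for a call to CNReplication.isNodeAuthorized().

```python
class NotAuthorized(Exception):
    """Stand-in for Exceptions.NotAuthorized."""


def serve_replica(pid, caller, store, public_pids, is_node_authorized, event_log):
    """Return the object bytes for a replica request and log the event.

    store maps identifiers to bytes, and is_node_authorized is a callable
    (subject, pid) -> bool. Both are hypothetical stand-ins.
    """
    # Restricted content requires confirmation that the caller may replicate it.
    if pid not in public_pids and not is_node_authorized(caller, pid):
        raise NotAuthorized(f"{caller} is not authorized to replicate {pid}")
    # Recorded as a replication event rather than a read event.
    event_log.append({"event": "replicate", "pid": pid, "subject": caller})
    return store[pid]


log = []
data = serve_replica(
    "XYZ33256", "urn:node:MN_B", {"XYZ33256": b"abc"}, set(),
    lambda subject, pid: True, log,
)
```

A real implementation would also map a missing identifier to an appropriate exception instead of letting the dictionary lookup fail.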

Version:

1.0

Use Cases:

UC09

REST URL:

GET /replica/{pid}

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • pid (Types.Identifier) – The identifier of the object to get as a replica. Transmitted as part of the URL path and must be escaped accordingly.
Returns:

Bytes of the specified object.

Return type:

Types.OctetStream

Raises:

Query API

The MNQuery API is an optional API that may be implemented by Member Nodes that intend to support querying the local repository. The actual form of the query is undefined, and it is expected that a small set of well known query engine types will be supported.

Functions defined in MNQuery
Tier Version REST Function Parameters
Tier 1 1.1 GET /query/{queryEngine}/{query} query() (session, queryEngine, query) -> Types.OctetStream
Tier 1 1.1 GET /query/{queryType} getQueryEngineDescription() (session, queryEngine) -> Types.QueryEngineDescription
Tier 1 1.1 GET /query listQueryEngines() (session) -> Types.QueryEngineList
MNQuery.query(session, queryEngine, query) → OctetStream

Submit a query against the specified queryEngine and return the response as formatted by the queryEngine.

The MNQuery.query() operation may be implemented by more than one type of search engine and the queryEngine parameter indicates which search engine is targeted. The value and form of query is determined by the specific query engine.

For example, the SOLR search engine will accept many of the standard parameters of SOLR, including field restrictions and faceting.

This method is optional for Member Nodes, but if implemented, both getQueryEngineDescription and listQueryEngines must also be implemented.

Version:

1.1

Use Cases:

UC02, UC16

REST URL:

GET /query/{queryEngine}/{query}

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • queryEngine (string) – Indicates which search engine will be used to handle the query. Supported search engines can be determined through the MNQuery.listQueryEngines API call. Transmitted as part of the URL path and must be escaped accordingly.
  • query (string) – The remainder of the URL is passed verbatim to the respective search engine implementation. Hence it may contain additional path elements and query elements as determined by the functionality of the search engine. The caller is responsible for providing a ‘?’ to indicate the start of the query string portion of the URL, as well as proper URL escaping. Transmitted as part of the URL path and must be escaped accordingly.
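A client might assemble such a request URL along the following lines. This is a sketch only: query_url is a hypothetical helper, the base URL is a placeholder, and the "title" field is assumed to exist in the target index (q and rows are standard SOLR parameters).

```python
from urllib.parse import quote, urlencode


def query_url(base_url, query_engine, params, path=""):
    """Build a GET /query/{queryEngine}/{query} URL.

    The caller supplies the '?' that starts the query string portion and
    escapes every key and value.
    """
    engine = quote(query_engine, safe="")
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = quote(path, safe="/") + "?" + urlencode(params, quote_via=quote)
    return f"{base_url.rstrip('/')}/query/{engine}/{query}"


url = query_url(
    "https://mn.example.org/mn/v1",
    "solr",
    {"q": "title:soil moisture", "rows": 10},
)
```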
Returns:

The structure of the response is determined by the chosen search engine and parameters provided to it.

Return type:

Types.OctetStream

Raises:
MNQuery.getQueryEngineDescription(session, queryEngine) → QueryEngineDescription

Provides metadata about the query service of the specified queryEngine. The metadata provides a brief description of the query engine, its version, its schema version, and an optional list of fields supported by the query engine.

Version:

1.1

REST URL:

GET /query/{queryType}

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate provided with the request. The certificate must be traceable to an authority recognized by DataONE, currently CILogon. Transmitted as part of the SSL handshake process.
  • queryEngine (string) – Indicates the query engine for which to provide descriptive metadata. Currently supported search engines can be determined through MNQuery.listQueryEngines. Transmitted as part of the URL path and must be escaped accordingly.
Returns:

A list of fields that are supported by the search index and additional metadata.

Return type:

Types.QueryEngineDescription

Raises:
MNQuery.listQueryEngines(session) → QueryEngineList

Returns a list of query engines, i.e. supported values for the queryEngine parameter of the getQueryEngineDescription and query operations.

The list of search engines available may be influenced by the authentication status of the request.

Version:

1.1

REST URL:

GET /query

Parameters:

  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate provided with the request. The certificate must be traceable to an authority recognized by DataONE, currently CILogon. Transmitted as part of the SSL handshake process.

Returns:

A list of names of queryEngines available to the user identified by session.

Return type:

Types.QueryEngineList

Raises:

View API

The MNView API is an optional API that may be implemented by Member Nodes that intend to support providing rendered views of content on their repository. Each repository can implement multiple themed views of their content, each accessed using the name of the theme and the identifier of the content to be viewed. Unlike the MNRead service, which returns the exact bytes of content, the MNView service provides a rendered view of the content which can transform the content into different formats. The most common use of the view service will likely be to provide a rendered HTML landing page at a well-known URL that can be used to provide a human-readable view of metadata and data. Other potential uses include providing alternative formats for metadata and data. Each Member Node that implements the MNView service must implement at least one theme named ‘default’ which provides the default view of all content. Other themes can be provided for use by various clients.

Functions defined in MNView
Tier Version REST Function Parameters
Tier 1 1.2 GET /views/{theme}/{pid} view() (session, theme, id) -> Types.OctetStream
Tier 1 1.2 GET /views listViews() (session) -> Types.OptionList
MNView.view(session, theme, id) → OctetStream

Provides a formatted view of an object (science metadata, data, resource, or other) using the given named theme.

If this service is implemented, the MNView.view() operation must implement at least one {theme} named ‘default’ to provide a standard (possibly minimalistic) view of the content in HTML format.

If the {theme} parameter is not recognized, the service must render the object using the default theme rather than throwing an error. Note that the return type of Types.OctetStream requires that the consuming client has a priori knowledge of the theme being returned (like HTML). Response headers must include the correct mime-type of the view being returned.
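The fallback rule above could be implemented along these lines. The sketch assumes a hypothetical registry that maps each theme name to the mime-type it produces; resolve_theme and the "json" theme are illustrative names only.

```python
def resolve_theme(requested, themes):
    """Choose the theme to render with, falling back to 'default'.

    themes maps theme names to the mime-type returned for that theme
    (a hypothetical registry used only for this example).
    """
    if "default" not in themes:
        raise ValueError("an MNView implementation must provide a 'default' theme")
    # An unrecognized theme is rendered with the default theme, not rejected.
    name = requested if requested in themes else "default"
    return name, themes[name]


themes = {"default": "text/html", "json": "application/json"}
known = resolve_theme("json", themes)
fallback = resolve_theme("no-such-theme", themes)
```

The second element of the returned pair is what a server would place in the Content-Type response header.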

This method is optional for Member Nodes, but if implemented, MNView.listViews must also be implemented.

Version:

1.2

REST URL:

GET /views/{theme}/{pid}

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • theme (string) – Indicates which themed view will be used to handle the query. All implementations must support a ‘default’ HTML theme, but are free to implement additional themes that return both HTML and non-HTML responses. Transmitted as part of the URL path and must be escaped accordingly.
  • id (Types.Identifier) – The identifier of the object to render in a view. May be either a PID or a SID. Transmitted as part of the URL path and must be escaped accordingly.
Returns:

Any return type is allowed, including application/octet-stream, but the format of the response should be specialized by the requested theme.

Return type:

Types.OctetStream

Raises:
MNView.listViews(session) → OptionList

Provides a list of usable themes for rendering content in a view, including a required ‘default’ theme. The list of themes is provided as an OptionList, where the option key should be used as the theme name in calls to MNView.view, and the description provides a human readable description of what will be returned for that theme.

This method is optional for Member Nodes, but if implemented, MNView.view must also be implemented.

Version:

1.2

REST URL:

GET /views

Parameters:

  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.

Returns:

A list of available themes that can be used with the MNView.view service.

Return type:

Types.OptionList

Raises:

Package API

The MNPackage API is an optional API that may be implemented by Member Nodes that intend to support downloading all of the contents of a data package in a single API call. Without this service, a client application must individually retrieve each of the metadata and data components of a package as they are listed in the ORE document that describes the package. Using the MNPackage service, a caller can instead request a serialized form of all of the data in a package, which is returned in the format requested. All implementations must support the BagIt format specification, but may also support additional well-defined packaging standards and specifications.

Functions defined in MNPackage
Tier Version REST Function Parameters
Tier 1 1.2 GET /packages/{packageType}/{pid} getPackage() (session, packageType, id) -> Types.OctetStream
MNPackage.getPackage(session, packageType, id) → OctetStream

Provides all of the content of a DataONE data package as defined by an OAI-ORE document in DataONE, in one of several possible package serialization formats. The serialized package will contain all of the data described in the ORE aggregation. The default implementation will include packages in the BagIt format. The packageType formats must be specified using the associated ObjectFormat formatId for that package serialization format.

The {id} parameter must be the identifier of an ORE package object. If it is the identifier of one of the science metadata documents or data files contained within the package, the Member Node should throw an InvalidRequest exception. Identifiers may be either PIDs or SIDs.

This method is optional for Member Nodes.

Version:

1.2

REST URL:

GET /packages/{packageType}/{pid}

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • packageType (Types.ObjectFormatIdentifier) – Indicates which package format will be used to serialize the package. All implementations must support a default BagIt package serialization, but are free to implement additional package serialization formats. Transmitted as part of the URL path and must be escaped accordingly.
  • id (Types.Identifier) – The identifier of the package or object in a package to be returned as a serialized package. Transmitted as part of the URL path and must be escaped accordingly.
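Because packageType is a format identifier, it will often contain characters such as "/" that need escaping when placed in the URL path. The sketch below shows one way a client might build the request URL. The helper package_url, the base URL, the identifier, and the example formatId value "application/bagit-097" are assumptions for illustration; consult the object format list for the identifiers a node actually supports.

```python
from urllib.parse import quote


def package_url(base_url, package_type, pid):
    """Build a GET /packages/{packageType}/{pid} URL with both parts escaped."""
    # safe="" ensures that '/' and ':' inside either value are percent-encoded
    # and cannot be mistaken for path separators.
    parts = (quote(package_type, safe=""), quote(pid, safe=""))
    return "{}/packages/{}/{}".format(base_url.rstrip("/"), *parts)


url = package_url(
    "https://mn.example.org/mn/v1",
    "application/bagit-097",  # assumed example formatId
    "doi:10.5063/F1",         # hypothetical package identifier
)
```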
Returns:

Any return type is allowed, including application/octet-stream, but the format of the response should be specialized by the requested packageType.

Return type:

Types.OctetStream

Raises:

Authorization API

Provides mechanisms for Member Nodes to verify access to resources for users (subjects). See the document Identity Management and Authenticated Session Management for more details on some authentication options.

Functions defined in MNAuthorization
Tier Version REST Function Parameters
Tier 2 1.0 GET /isAuthorized/{id}?action={action} isAuthorized() (session, id, action) -> boolean
MNAuthorization.isAuthorized(session, id, action) → boolean

Test if the user identified by the provided session has authorization for the requested operation on the specified object.

A successful operation is indicated by a return HTTP status of 200.

Failure is indicated by an exception such as NotAuthorized being returned.

The body of the response is arbitrary and SHOULD be ignored by the caller.

If the action is not authorized, then a NotAuthorized exception MUST be raised.

Note

Should perhaps add convenience methods for “canRead()” and “canWrite()” to verify that a user is able to read / write an object.

Version:

1.0

Use Cases:

UC01, UC37

REST URL:

GET /isAuthorized/{id}?action={action}

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • id (Types.Identifier) – The identifier of the resource for which access is being checked. May be either a PID or a SID. Will use the HEAD PID when given a SID value. Transmitted as part of the URL path and must be escaped accordingly.
  • action (Types.Permission) – The type of operation which is being requested for the given pid. Transmitted as a URL query parameter, and so must be escaped accordingly.
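Since the response body is to be ignored, a client only needs the request URL and the HTTP status. The following sketch shows one possible client-side treatment. The helper names and base URL are hypothetical, and it assumes NotAuthorized is delivered with HTTP status 401.

```python
from urllib.parse import quote, urlencode


def is_authorized_url(base_url, pid, action):
    """Build a GET /isAuthorized/{id}?action={action} URL."""
    return "{}/isAuthorized/{}?{}".format(
        base_url.rstrip("/"), quote(pid, safe=""), urlencode({"action": action})
    )


def interpret_is_authorized(status):
    """Map the HTTP status to a boolean, ignoring the response body."""
    if status == 200:
        return True
    if status == 401:  # assumed status for Exceptions.NotAuthorized
        return False
    # Any other status signals a different exception, not an authorization answer.
    raise RuntimeError(f"unexpected HTTP status {status}")


url = is_authorized_url("https://mn.example.org/mn/v1", "XYZ33256", "read")
```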
Returns:

True if the operation is allowed

Return type:

boolean

Raises:

Storage API

Functions defined in MNStorage
Tier Version REST Function Parameters
Tier 3 1.0 POST /object create() (session, pid, object, sysmeta) -> Types.Identifier
Tier 3 1.0 PUT /object/{pid} update() (session, pid, object, newPid, sysmeta) -> Types.Identifier
Tier 3 1.0 POST /generate generateIdentifier() (session, scheme, [fragment]) -> Types.Identifier
Tier 3 1.0 DELETE /object/{id} delete() (session, id) -> Types.Identifier
Tier 3 1.0 PUT /archive/{id} archive() (session, id) -> Types.Identifier
Tier 1 2.0 PUT /meta updateSystemMetadata() (session, pid, sysmeta) -> boolean
MNStorage.create(session, pid, object, sysmeta) → Identifier

Called by a client to add a new object to the Member Node.

The pid must not exist in the DataONE system or should have been previously reserved using CNCore.reserveIdentifier(). A new, unique Types.SystemMetadata.seriesId may be included.

The caller MUST have authorization to write or create content on the Member Node.

Version:

1.0

Use Cases:

UC04, UC09, UC16

REST URL:

POST /object

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • pid (Types.Identifier) – The identifier that should be used in DataONE to identify and access the object. This is an Unicode string that follows the constraints on identifiers described in Identifiers in DataONE. If the identifier is already in use, Exceptions.IdentifierNotUnique will be raised and the client SHOULD try again with a different, unique identifier. Transmitted as a UTF-8 String as a Param part of the MIME multipart/mixed message.
  • object (bytes) – The data bytes that are to be added to the Member Node.
  • sysmeta (Types.SystemMetadata) – The system metadata document that provides basic information about the object, including a reference to its identifier, access control information, etc. Attributes of the sysmeta that are the responsibility of the client MUST be set. Note that the obsoletes and obsoletedBy elements MUST NOT be set. It is the role of the update() method to ensure these are properly updated to ensure object lineage is as expected. Transmitted as a UTF-8 encoded XML structure for the respective type as defined in the DataONE types schema, as a File part of the MIME multipart/mixed message.
Returns:

The identifier that was used to insert the document into the system.

Return type:

Types.Identifier

Raises:

Examples

The outgoing request body must be encoded as MIME multipart/form-data with the system metadata portion and the object as file attachments.

(POST) Create a new object with a given identifier (XYZ33256):

curl -E /tmp/x509up_u502 \
     -F "pid=XYZ33256" \
     -F "object=@sciencemetadata.xml" \
     -F "sysmeta=@sysmeta.xml" \
     https://m1.dataone.org/mn/v1/object

HTTP/1.1 200 Success
Content-Type:
Date: Wed, 16 Dec 2009 13:58:34 GMT
Content-Length: 355

XYZ33256

The system metadata included with the create call must contain values for the elements required to be set by clients (see System Metadata). The system metadata document can be crafted by hand or preferably with a tool such as generate_sysmeta.py which is available in the d1_instance_generator Python package. See documentation included with that package for more information on its operation.

For example, the system metadata document for the example above was generated using the sequence of commands:

<<log on to cilogon.org and download my certificate>>

MYSUBJECT=`python my_subject.py /tmp/x509up_u502`
echo $MYSUBJECT

CN=Dave Vieglais T799,O=Google,C=US,DC=cilogon,DC=org

python generate_sysmeta.py -f sciencemetadata.xml \
                           -i "XYZ33256" \
                           -s "$MYSUBJECT" \
                           -t "eml://ecoinformatics.org/eml-2.0.1" \
                            > sysmeta.xml

The generated system metadata document contains default information that indicates:

  • The submitter is CN=Dave Vieglais T799,O=Google,C=US,DC=cilogon,DC=org
  • The rights holder is the same as the submitter
  • The access policy indicates public read, and write by the submitter
  • The replication policy indicates replication is allowed to any node

The generated system metadata document is presented below:

<?xml version='1.0' encoding='UTF-8'?>
<ns1:systemMetadata xmlns:ns1="http://ns.dataone.org/service/types/v1">
  <identifier>XYZ33256</identifier>
  <formatId>eml://ecoinformatics.org/eml-2.0.1</formatId>
  <size>22936</size>
  <checksum algorithm="MD5">2ec0084d1e11e0d5c9a46ba6a230aa85</checksum>
  <submitter>CN=Dave Vieglais T799,O=Google,C=US,DC=cilogon,DC=org</submitter>
  <rightsHolder>CN=Dave Vieglais T799,O=Google,C=US,DC=cilogon,DC=org</rightsHolder>
  <accessPolicy>
    <allow>
      <subject>public</subject>
      <permission>read</permission>
    </allow>
    <allow>
      <subject>CN=Dave Vieglais T799,O=Google,C=US,DC=cilogon,DC=org</subject>
      <permission>changePermission</permission>
    </allow>
  </accessPolicy>
  <replicationPolicy replicationAllowed="true"/>
  <dateUploaded>2012-02-20T20:39:19.664495</dateUploaded>
  <dateSysMetadataModified>2012-02-20T20:39:19.70598</dateSysMetadataModified>
</ns1:systemMetadata>
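The size and checksum elements in the document above describe the object bytes. When crafting system metadata without a generator tool, a client could derive both values with a short routine such as the one below. It is a sketch only: size_and_checksum is a hypothetical helper, and MD5 is used only because the example document uses it.

```python
import hashlib
import io


def size_and_checksum(stream, algorithm="md5", chunk_size=65536):
    """Return (size in bytes, hex digest) for a binary stream."""
    digest = hashlib.new(algorithm)
    size = 0
    # Read in chunks so that large objects are not loaded into memory at once.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
        size += len(chunk)
    return size, digest.hexdigest()


# In practice the stream would be open("sciencemetadata.xml", "rb").
size, checksum = size_and_checksum(io.BytesIO(b"hello"))
```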
MNStorage.update(session, pid, object, newPid, sysmeta) → Identifier

This method is called by clients to update objects on Member Nodes.

Updates an existing object by creating a new object identified by newPid on the Member Node which explicitly obsoletes the object identified by pid through appropriate changes to the SystemMetadata of pid and newPid.

The Member Node sets Types.SystemMetadata.obsoletedBy on the object being obsoleted to the pid of the new object. It then updates Types.SystemMetadata.dateSysMetadataModified on both the new and old objects. The modified system metadata entries then become available in MNRead.listObjects(). This ensures that a Coordinating Node will pick up the changes when filtering on Types.SystemMetadata.dateSysMetadataModified.

The update operation MUST fail with Exceptions.InvalidRequest on objects that have the Types.SystemMetadata.archived property set to true.

A new, unique Types.SystemMetadata.seriesId may be included when beginning a series, or a series may be extended if the newPid obsoletes the existing pid.

Version:

1.0

Use Cases:

UC16

REST URL:

PUT /object/{pid}

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • pid (Types.Identifier) – The identifier of the object that is being updated. If this identifier does not exist in the system, an error is raised and the operation does not cause any changes to the objects or their metadata. Transmitted as part of the URL path and must be escaped accordingly.
  • object (bytes) – The bytes of the data or science metadata object that will be deprecating the existing object.
  • newPid (Types.Identifier) – The identifier that will become the replacement identifier for the existing object after the update. This identifier must have been previously reserved. Transmitted as a UTF-8 String as a Param part of the MIME multipart/mixed message.
  • sysmeta (Types.SystemMetadata) – A System Metadata document describing the new object. The SystemMetadata.obsoletes field must contain the identifier of the object being obsoleted. Other required client provided fields as described for Types.SystemMetadata must be filled. Transmitted as a UTF-8 encoded XML structure for the respective type as defined in the DataONE types schema, as a File part of the MIME multipart/mixed message.
Returns:

The identifier of the document that is replacing the original, which should be the same as newPid.

Return type:

Types.Identifier

Raises:
  • Exceptions.NotAuthorized(errorCode=401, detailCode=1200)
  • Exceptions.IdentifierNotUnique

    (errorCode=409, detailCode=1220)

    The requested identifier is already used by another object and therefore can not be used for this object. Clients should choose a new identifier that is unique and retry the operation.

  • Exceptions.UnsupportedType

    (errorCode=400, detailCode=1240)

    The MN can not deal with the object provided.

  • Exceptions.InsufficientResources

    (errorCode=413, detailCode=1260)

    The MN is unable to execute the transfer, for example because it does not have sufficient storage space.

  • Exceptions.NotFound

    (errorCode=404, detailCode=1280)

    The update operation failed because the object which was supposed to be updated in the system (indicated via the pid parameter) is not present in the DataONE system, so update is an illegal operation.

  • Exceptions.InvalidSystemMetadata

    (errorCode=400, detailCode=1300)

    One or more required fields are not set, the metadata document is malformed or the value of some field is not valid. SystemMetadata.obsoletes is set by the client and does not match the pid of the object being obsoleted. SystemMetadata.obsoletedBy is set on the SystemMetadata of the new object provided by the client (a new object cannot be created in an obsoleted state). SystemMetadata.obsoletedBy is already set on the object being obsoleted (no branching is allowed in the obsolescence chain).

  • Exceptions.ServiceFailure(errorCode=500, detailCode=1310)
  • Exceptions.InvalidToken(errorCode=401, detailCode=1210)
  • Exceptions.NotImplemented(errorCode=501, detailCode=1201)
  • Exceptions.InvalidRequest

    (errorCode=400, detailCode=1202)

    Raised when the request parameters are incorrect or the operation is not applicable to the current state of the object (e.g. an archived object can not be updated)
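The obsolescence rules behind several of these exceptions can be summarized in a small model. The sketch below is not a Member Node implementation: it keeps system metadata as plain dictionaries in a hypothetical in-memory store, uses local stand-ins for the DataONE exception types, and omits the dateSysMetadataModified updates.

```python
class NotFound(Exception):
    """Stand-in for Exceptions.NotFound."""


class IdentifierNotUnique(Exception):
    """Stand-in for Exceptions.IdentifierNotUnique."""


class InvalidRequest(Exception):
    """Stand-in for Exceptions.InvalidRequest."""


class InvalidSystemMetadata(Exception):
    """Stand-in for Exceptions.InvalidSystemMetadata."""


def apply_update(pid, new_pid, new_sysmeta, store):
    """Obsolete pid with new_pid in store, enforcing the chain rules."""
    old = store.get(pid)
    if old is None:
        raise NotFound(pid)
    if new_pid in store:
        raise IdentifierNotUnique(new_pid)
    if old.get("archived"):
        raise InvalidRequest("an archived object can not be updated")
    obsoletes = new_sysmeta.get("obsoletes")
    if obsoletes is not None and obsoletes != pid:
        raise InvalidSystemMetadata("obsoletes does not match pid")
    if new_sysmeta.get("obsoletedBy") is not None:
        raise InvalidSystemMetadata("a new object can not be created obsoleted")
    if old.get("obsoletedBy") is not None:
        raise InvalidSystemMetadata("pid is already obsoleted, no branching allowed")
    # Link both ends of the chain.
    old["obsoletedBy"] = new_pid
    store[new_pid] = dict(new_sysmeta, obsoletes=pid)
    return new_pid


store = {"A": {}}
result = apply_update("A", "B", {"obsoletes": "A"}, store)
```

Calling apply_update("A", "C", ...) after this point raises InvalidSystemMetadata, because "A" is already obsoleted by "B".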

MNStorage.generateIdentifier(session, scheme[, fragment]) → Identifier

Given a scheme and optional fragment, generates an identifier with that scheme and fragment that is unique. May be used for generating either PIDs or SIDs.

The message body is encoded as MIME Multipart/form-data.

Version:

1.0

REST URL:

POST /generate

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • scheme (string) – The name of the identifier scheme to be used, drawn from a DataONE-specific vocabulary of identifier scheme names, including several common syntaxes such as DOI, ARK, LSID, UUID, and LSRN, among others. The first version of this method only supports the UUID scheme, and ignores the fragment parameter. Transmitted as a UTF-8 String as a Param part of the MIME multipart/mixed message.
  • fragment (string) – The optional fragment to include in the generated Identifier. This parameter is optional and may not be present in the message body. Transmitted as a UTF-8 String as a Param part of the MIME multipart/mixed message.
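A server-side sketch of the first-version behaviour (UUID scheme only, fragment ignored) might look like the following. The function name and the "urn:uuid:" prefix on the returned value are assumptions for illustration; the specification text above does not state the exact form of the generated identifier.

```python
import uuid


def generate_identifier(scheme, fragment=None):
    """Generate a unique identifier for the given scheme.

    Only the UUID scheme is handled, and fragment is accepted but ignored,
    mirroring the first version of the method described above.
    """
    if scheme.upper() != "UUID":
        raise NotImplementedError(f"scheme {scheme!r} is not supported")
    # uuid4() yields a random UUID; the URN prefix is an assumed convention.
    return f"urn:uuid:{uuid.uuid4()}"


first = generate_identifier("UUID")
second = generate_identifier("UUID", fragment="ignored")
```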
Returns:

The identifier that was generated

Return type:

Types.Identifier

Raises:

Todo

Need to provide a list of recommended identifier schemes.

MNStorage.delete(session, id) → Identifier

Deletes an object managed by DataONE from the Member Node. Member Nodes MUST check that the caller (typically a Coordinating Node) is authorized to perform this function.

The delete operation will be used primarily by Coordinating Nodes to help manage the number of replicas of an object that are present in the entire system.

The operation removes the object from further interaction with DataONE services. The implementation may delete the object bytes, and in general should do so since a delete operation may be in response to a problem with the object (e.g. it contains malicious content, is inappropriate, or is the subject of a legal request).

If the object does not exist on the node servicing the request, then an Exceptions.NotFound exception is raised. The message body of the exception SHOULD contain a hint as to the location of the CNRead.resolve() method.

Version:

1.0

Use Cases:

UC16

REST URL:

DELETE /object/{id}

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • id (Types.Identifier) – The identifier of the object to be deleted. May be either a PID or a SID. Will delete the HEAD PID when called with a SID. Transmitted as part of the URL path and must be escaped accordingly.
Returns:

The identifier of the object that was deleted.

Return type:

Types.Identifier

Raises:
MNStorage.archive(session, id) → Identifier

Hides an object managed by DataONE from search operations, effectively preventing its discovery during normal operations.

The operation does not delete the object bytes, but instead sets the Types.SystemMetadata.archived flag to True. This ensures that the object can still be resolved (and hence remain valid for existing citations and cross references), though it will not appear in searches.

Objects that are archived can not be updated through the MNStorage.update() operation.

Archived objects can not be un-archived. This behavior may change in future versions of the DataONE API.

Member Nodes MUST check that the caller is authorized to perform this function.

If the object does not exist on the node servicing the request, then an Exceptions.NotFound exception is raised. The message body of the exception SHOULD contain a hint as to the location of the CNRead.resolve() method.

Version:

1.0

REST URL:

PUT /archive/{id}

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • id (Types.Identifier) – The identifier of the object to be archived. May be either a PID or a SID. Will archive the HEAD PID when called with a SID. Transmitted as part of the URL path and must be escaped accordingly.
Returns:

The identifier of the object that was archived.

Return type:

Types.Identifier

Raises:
MNStorage.updateSystemMetadata(session, pid, sysmeta) → boolean

Provides a mechanism for updating the system metadata of any object held on the Member Node where that Member Node is the authoritative Member Node. A Coordinating Node can also call this method on a non-authoritative Member Node, but this is not a normal operation and is reserved for the special case where the authoritative Member Node no longer exists. Calling the method on a non-authoritative Member Node during normal operation can have unexpected consequences.

This method is typically used by the authoritative Member Node or by rights holders to ensure system metadata quality.

Version:

2.0

REST URL:

PUT /meta

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • pid (Types.Identifier) – The identifier of the object whose system metadata is being updated. Transmitted as a UTF-8 String as a Param part of the MIME multipart/mixed message.
  • sysmeta (Types.SystemMetadata) – The updated system metadata document for the object. Transmitted as a UTF-8 encoded XML structure for the respective type as defined in the DataONE types schema, as a File part of the MIME multipart/mixed message.
Returns:

True if the update was successful.

Return type:

boolean

Raises:

Replication API

The Replication API provides methods to support CN-directed replication of content between MNs.

Functions defined in MNReplication
Tier Version REST Function Parameters
Tier 4 1.0 POST /replicate replicate() (session, sysmeta, sourceNode) -> boolean
MNReplication.replicate(session, sysmeta, sourceNode) → boolean

Called by a Coordinating Node to request that the Member Node create a copy of the specified object by retrieving it from another Member Node and storing it locally so that it can be made accessible to the DataONE system.

A successful operation is indicated by a HTTP status of 200 on the response.

Failure of the operation MUST be indicated by returning an appropriate exception.

Access control for this method MUST be configured to allow calling by Coordinating Nodes.

Version:

1.0

Use Cases:

UC09

REST URL:

POST /replicate

Parameters:
  • session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate which must be traceable to the CILogon service. The subject of the session defaults to the public user if the certificate was not provided with the request. Transmitted as part of the SSL handshake process.
  • sysmeta (Types.SystemMetadata) – Copy of the CN-held system metadata for the object. Transmitted as a UTF-8 encoded XML structure for the respective type as defined in the DataONE types schema, as a File part of the MIME multipart/mixed message.
  • sourceNode (Types.NodeReference) – A reference to the node from which the content should be retrieved. The reference should be resolved by checking the CN node registry. Transmitted as a UTF-8 String as a Param part of the MIME multipart/mixed message.
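
A receiving Member Node has to turn the sourceNode reference into a URL before it can retrieve the content. The following Python sketch shows one way this lookup might look. It rests on several assumptions that are not part of this specification: the node registry is represented by a plain dictionary standing in for the node list obtained from a Coordinating Node, the base URL is hypothetical, and the object is assumed to be fetched through the source node's GET /object/{id} path, although an implementation may use a different retrieval endpoint. The identifier is path encoded, as required for parameters included in a URL path.

```python
from urllib.parse import quote


def resolve_base_url(node_ref, registry):
    """Look up the base URL registered for a node reference.

    registry is a hypothetical dict of node reference -> base URL that
    stands in for the node list held by the CN node registry.
    """
    try:
        return registry[node_ref]
    except KeyError:
        raise LookupError(f"node {node_ref!r} is not in the node registry")


def object_url(base_url, pid):
    """Build the retrieval URL for pid, path encoding the identifier."""
    # safe="" forces characters such as "/" and ":" to be percent-encoded.
    return f"{base_url.rstrip('/')}/object/{quote(pid, safe='')}"


if __name__ == "__main__":
    registry = {"urn:node:MN_B": "https://mn-b.example.org/mn/v1"}
    base = resolve_base_url("urn:node:MN_B", registry)
    print(object_url(base, "doi:10.1234/abc"))
```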
Returns:

True if the operation succeeds; otherwise an error is returned.

Return type:

boolean

Raises:

Response

The response should be a valid HTTP response with a blank or arbitrary body. Only the HTTP header information is considered by the requester. A successful response must have an HTTP status code of 200. In case of an error condition, the appropriate HTTP status code must be set, and an exception or error information may be returned in the response.
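
The success rule above can be sketched from the caller's side as follows. This is a minimal Python illustration, not a prescribed implementation: the names ReplicateError and check_replicate_response are hypothetical, and the error detail is simply the decoded response body, since the format of any returned error information is not assumed here.

```python
class ReplicateError(Exception):
    """Hypothetical error type carrying the HTTP status and any detail."""

    def __init__(self, status, detail=""):
        super().__init__(f"replicate failed with HTTP {status}: {detail}")
        self.status = status
        self.detail = detail


def check_replicate_response(status, body=b""):
    """Return True for HTTP 200; raise ReplicateError for anything else.

    The body is ignored on success because only the header information
    is considered. On failure it is kept as optional error detail.
    """
    if status == 200:
        return True
    raise ReplicateError(status, body.decode("utf-8", errors="replace"))


if __name__ == "__main__":
    print(check_replicate_response(200, b"arbitrary body"))
    try:
        check_replicate_response(500, b"<error/>")
    except ReplicateError as exc:
        print(exc.status, exc.detail)
```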

The outgoing request body must be encoded as MIME multipart/form-data with the system metadata portion as a file attachment and the sourceNode parameter as a form field.

curl -v -X POST "https://localhost:8000/mn/v1/replicate" \
  -H "Content-type: multipart/form-data" \
  -F "sysmeta=@systemmetadata.xml" \
  -F "sourceNode=urn:node:MN_B"

* About to connect() to localhost port 8000 (#0)
*   Trying ::1... Connection refused
*   Trying fe80::1... Connection refused
*   Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 8000 (#0)
> POST /mn/v1/replicate HTTP/1.1
> User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
> Host: localhost:8000
> Accept: */*
> Content-Length: 1021
> Expect: 100-continue
> Content-type: multipart/form-data; boundary=----------------------------88ffdd8070e9
>
* Done waiting for 100-continue
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Date: Fri, 14 Jan 2011 22:01:13 GMT
< Server: WSGIServer/0.1 Python/2.6.1
< Content-Type: text/xml
<
<
* Closing connection #0