Warning: These documents are under active
development and subject to change (version 2.1.0-beta).
The latest release documents are at:
https://purl.dataone.org/architecture
This document is OBSOLETE, and has been superseded by information in the DataONE types schema. It will be deleted after review.
A NodeList is a synchronized register for all of the nodes in the DataONE environment. It contains the information needed by DataONE to orchestrate activities across the distributed coordinating and member nodes of the network. While some information is provided by the Member Nodes themselves, the node list is maintained dynamically by the Coordinating Nodes. The node list is mutable in that it reflects the latest state of the nodes that are part of the system. Replicated copies of the node list are maintained at each of the Coordinating nodes.
Registry
ContactGroup
groupid
name
description
members
Contact
contactid
role (administrator, manager, ...)
givenName (first name)
sn (surname)
notification
type (phone, email, IRC, ...)
connection (phone number, email address, IRC channel)
Network (1..n, replaces "environment")
networkid
name
description
adminGroup
notifyGroup
Node
nodeid
name
description
location
adminGroup
notifyGroup
created (date created / registered)
modified (time stamp for modification)
lastSynchronization (time stamp)
objectFormatsSupported (list of object formats known to support)
synchronize
replicate
replicationTarget
service
version (schema version supported, MN)
baseURL (MN)
name (human readable name for service, e.g. "DataONE-0.6.1", MN)
activeNetwork (id of network this interface is active for, MN)
lastChecked (last time service was examined, CN)
method
name (MN)
isactive (set by CN)
The node list is a complex data type, with three main sub-structures: services, synchronization, and health. Some data is provided at node registration time, while other items are generated by DataONE itself in the course of managing objects.
The nodelist schema is expressed in XMLSchema and is available at:
The following list of fields represents the set of information collected and maintained by Coordinating Nodes for every node in the system.
Table 1. Quick reference to the NodeList fields described in more detail below.
Group | Field | Type | Cardinality | Generate By | Version |
---|---|---|---|---|---|
General | |||||
identifier |
NodeReference | 1 | CN | 0.5 | |
name |
NonEmptyString | 1 | CN | 0.5 | |
description |
NonEmptyString | 1 | CN | 0.5 | |
baseURL |
anyURI | 1 | MN | 0.5 | |
services |
Service | 0..n | MN | 0.5 | |
synchronization |
Synchronization | 0..1 | CN | 0.5 | |
health |
NodeHealth | 0..1 | CN | 0.5 | |
replicate |
boolean | 1 | MN | 0.5 | |
synchronize |
boolean | 1 | MN | 0.5 | |
type |
NodeType | 1 | CN | 0.5 | |
environment |
Environment | 1 | CN | 0.5 | |
Services | |||||
services.name |
ServiceName | 1 | MN | 0.5 | |
services.version |
string | 1 | MN | 0.5 | |
services.available |
boolean | 0..1 | MN | 0.5 | |
services.method |
ServiceMethod | 0..n | MN | 0.5 | |
services.method.name |
NMToken | 0..1 | CN | 0.5 | |
services.method.rest |
xs:token | 1 | MN | 0.5 | |
services.method.implemented |
boolean | 1 | MN | 0.5 | |
Synchronization | |||||
synchronization.lastHarvested |
dateTime | 1 | CN | 0.5 | |
synchronization.lastCompleteHarvest |
dateTime | 1 | CN | 0.5 | |
synchronization.schedule |
Schedule | 1 | CN | 0.5 | |
synchronization.schedule.sec |
crontabEntryType | 1 | CN | 0.5 | |
synchronization.schedule.min |
crontabEntryType | 1 | CN | 0.5 | |
synchronization.schedule.hour |
crontabEntryType | 1 | CN | 0.5 | |
synchronization.schedule.mday |
crontabEntryType | 1 | CN | 0.5 | |
synchronization.schedule.mon |
crontabEntryType | 1 | CN | 0.5 | |
synchronization.schedule.year |
crontabEntryType | 1 | CN | 0.5 | |
synchronization.schedule.wday |
crontabEntryType | 1 | CN | 0.5 | |
Health | |||||
health.ping |
Ping | 1 | CN | 0.5 | |
health.status |
Status | 1 | CN | 0.5 | |
health.state |
State | 1 | CN | 0.5 | |
health.ping.success |
boolean | 0..1 | CN | 0.5 | |
health.ping.lastSuccess |
dateTime | 0..1 | CN | 0.5 | |
health.status.success |
boolean | 0..1 | CN | 0.5 | |
health.status.dateChecked |
dateTime | 0..1 | CN | 0.5 |
NodeList.
identifier
¶A unique identifier for the node of type NodeReference. This may initially be the same as the baseURL, however this value should not change for future implementations of the same node, whereas the baseURL may change in the future.
Cardinality: | 1 |
---|---|
ValueSpace: | NodeReference |
Generated By: | CN |
Required Version: | |
0.5 |
NodeList.
name
¶A human readable name for the node. (The name of the node is being used in Mercury currently to assign a path, so the format should be consistent with dataone directory naming conventions).
Cardinality: | 1 |
---|---|
ValueSpace: | NonEmptyString |
Generated By: | CN |
Required Version: | |
0.5 |
NodeList.
description
¶Description of content maintained by this node and any other free style notes.
Cardinality: | 1 |
---|---|
ValueSpace: | NonEmptyString |
Generated By: | CN |
Required Version: | |
0.5 |
NodeList.
baseURL
¶Of type anyURI, it is the base URL that is complete enough with the service.method.rest attribute to create a valid call.
Cardinality: | 1 |
---|---|
ValueSpace: | anyURI |
Generated By: | CN |
Required Version: | |
0.5 |
NodeList.
replicate
¶A flag to tell the CN whether or not to replicate MN data.
Cardinality: | 1 |
---|---|
ValueSpace: | boolean |
Generated By: | CN |
Required Version: | |
0.5 |
NodeList.
synchronize
¶A flag to tell the CN to synchronize or not. Applies to CNs and MNs (although CNs are presumed to synchronize)
Cardinality: | 1 |
---|---|
ValueSpace: | boolean |
Generated By: | CN |
Required Version: | |
0.5 |
NodeList.
type
¶The type of node in the dataONE world this one is. Legal values are “MN” and “CN”.
Cardinality: | 1 |
---|---|
ValueSpace: | NodeType |
Generated By: | CN |
Required Version: | |
0.5 |
NodeList.
environment
¶The systems environment the node belongs to. Legal values are “dev”, “test”, “staging”, and “prod”.
Cardinality: | 1 |
---|---|
ValueSpace: | Environment |
Generated By: | CN |
Required Version: | |
0.5 |
services.
name
¶The name of the service exposed by the node
Cardinality: | 1 |
---|---|
ValueSpace: | ServiceName |
Generated By: | CN |
Required Version: | |
0.5 |
services.
version
¶The version of the service implemented. Since not all member nodes can be orchestrated to migrate versions simultaneously, the version is needed to ensure business continuity in the eventuality of dataone-service-api upgrades.
Cardinality: | 1 |
---|---|
ValueSpace: | string |
Generated By: | CN |
Required Version: | |
0.5 |
services.
available
¶A flag to indicate whether or not the service is available. Determined by the CN.
Cardinality: | 0..1 |
---|---|
ValueSpace: | boolean |
Generated By: | CN |
Required Version: | |
0.5 |
services.method.
name
¶the name of the method implemented by the service
Cardinality: | 0..1 |
---|---|
ValueSpace: | NMToken |
Generated By: | CN |
Required Version: | |
0.5 |
services.method.
rest
¶the rest path, relative to the baseURL of the node, that calls the method
Cardinality: | 1 |
---|---|
ValueSpace: | xs:token |
Generated By: | CN |
Required Version: | |
0.5 |
services.method.
implemented
¶A flag to indicate if this method is implemented on the node. Determined by the MN through the addCapabilities method.
Cardinality: | 1 |
---|---|
ValueSpace: | boolean |
Generated By: | CN |
Required Version: | |
0.5 |
synchronization.
lastHarvested
¶Set by a CN, contains the time of last MN-synchronization with a CN. The dateTime is taken from the frame of reference of the member node, that is to say, it uses the latest modification date from the objects harvested.
Cardinality: | 1 |
---|---|
ValueSpace: | dateTime |
Generated By: | CN |
Required Version: | |
0.5 |
synchronization.
lastCompleteHarvest
¶Set by a CN, contains the time of the last complete harvest from a MN. A complete harvest is a full re-harvesting from a member node not relying on last harvest time. This value of this field should always be the same or earlier than the lastHarvested field.
Cardinality: | 1 |
---|---|
ValueSpace: | dateTime |
Generated By: | CN |
Required Version: | |
0.5 |
synchronization.
schedule
¶a set of numerical list or range values used to set the synchronization schedule with a MN, following crontab formatting rules. See wikipedia entry for a popular, if not technical, explanation of crobtab http://en.wikipedia.org/wiki/Cron.
Cardinality: | 1 |
---|---|
ValueSpace: | Schedule |
Generated By: | CN |
Required Version: | |
0.5 |
health.
state
¶The state of health of the node, based on ping and status calls. Legal values are “up”, “down”, “unknown”.
Cardinality: | 1 |
---|---|
ValueSpace: | State |
Generated By: | CN |
Required Version: | |
0.5 |
health.ping.
success
¶A flag showing whether the last mn_health.ping was successful or not.
Cardinality: | 0..1 |
---|---|
ValueSpace: | boolean |
Generated By: | CN |
Required Version: | |
0.5 |
health.ping.
lastSuccess
¶The time of last successful mn_health.ping to the node.
Cardinality: | 0..1 |
---|---|
ValueSpace: | dateTime |
Generated By: | CN |
Required Version: | |
0.5 |
health.status.
success
¶A flag showing whether the last mn_health.status method call was successful or not.
Cardinality: | 0..1 |
---|---|
ValueSpace: | boolean |
Generated By: | CN |
Required Version: | |
0.5 |
health.status.
dateChecked
¶The time of the last mn_health.status call to the node.
Cardinality: | 0..1 |
---|---|
ValueSpace: | dateTime |
Generated By: | CN |
Required Version: | |
0.5 |
The object format in protocol buffer format A set of values that describe a node, its Internet location, the services it supports and its replication policy.
message Node
{
required NodeReference identifier = 1;
required NonEmptyString name = 2;
required NonEmptyString description = 3;
required anyURI baseURL = 4;
repeated Service services = 5;
optional Synchronization synchronization = 6;
optional NodeHealth health = 7;
required boolean replicate = 8;
required boolean synchronize = 9;
required NMToken(string) type = 10;
message Service
{
required ServiceName name = 0;
required string version = 1;
boolean available = 2;
repeated ServiceMethod method = 3;
message ServiceMethod
{
optional NMToken name = 0;
required xs:token rest = 1;
required boolean implemented = 2;
}
}
message Synchronization
{
required dateTime lastHarvested = 0;
required dateTime lastCompleteHarvest = 1;
required Schedule schedule = 2;
message Schedule
{
required crontabEntryType sec = 0;
required crontabEntryType min = 1;
required crontabEntryType hour = 2;
required crontabEntryType mday = 3;
required crontabEntryType mon = 4;
required crontabEntryType year = 5;
required crontabEntryType wday = 6;
}
}
message NodeHealth
{
required Ping ping = 0;
required Status status = 1;
required State state = 2;
message Ping
{
optional boolean success = 0;
optional dateTime lastSuccess = 1;
}
message Status
{
optional boolean success = 0;
optional dateTime dateChecked = 1;
}
enum State
{
UP = 0;
DOWN = 1;
UNKNOWN = 2;
}
}
}