Warning: These documents are under active
development and subject to change (version 2.1.0-beta).
The latest release documents are at:
https://purl.dataone.org/architecture
DataONE nodes are of two types, Coordinating Nodes and Member Nodes. Member Nodes are data and metadata providers that serve particular communities and that agree to interoperate with other nodes using the DataONE Service Interface. Coordinating Nodes provide services to each other and to the network of Member Nodes to enable DataONE to function as an integrated federation.
Each node in DataONE is assigned a unique, immutable identifier which serves to link all information about the node together in the system. References in various metadata documents in DataONE always utilize this NodeReference, as this will remain constant even as protocols and service endpoints evolve over time. Thus, while the URL endpoint for a node’s services may change over time, possibly even moving across domains, the NodeReference will always be constant. The DataONE NodeReference takes the following form:
NodeReference = urn ":" node ":" identifier
urn = "urn"
node = "node"
identifier = *( idchars )
idchars = ALPHA / DIGIT / "_"
ALPHA and DIGIT are patterns representing the upper- and lowercase ASCII letters [A-Za-z] and the ASCII digits [0-9], as defined in the ABNF standard. Thus, urn:node: is a constant prefix, always in lowercase, and identifier is a short, unique, case-sensitive name for the node. For example, valid NodeReferences might include:
urn:node:KNB
urn:node:DRYAD
urn:node:CN_UCSB
By policy, the length of node identifiers will generally be restricted to 25 characters, inclusive of the urn:node: prefix, and each identifier will be reviewed for appropriateness for the node during the node approval process (see Node Registration below).
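The grammar and the length policy above can be checked mechanically. The following is a minimal sketch, not an official DataONE validator: the function name check_node_reference is hypothetical, and requiring at least one identifier character is an assumption (the ABNF as written permits zero).

```python
import re

MAX_LENGTH = 25  # policy limit from the text, counted with the urn:node: prefix

# The ABNF allows zero or more idchars; requiring at least one here is an
# assumption, since an empty identifier could not name a node.
_NODE_REF = re.compile(r"urn:node:[A-Za-z0-9_]+")


def check_node_reference(value: str) -> list:
    """Return a list of problems; an empty list means the value passes."""
    problems = []
    if not _NODE_REF.fullmatch(value):
        problems.append("must be urn:node: followed by [A-Za-z0-9_] characters")
    if len(value) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters including the prefix")
    return problems
```

Because the prefix is matched literally, a value such as URN:NODE:KNB is rejected, which reflects the rule that the prefix is always lowercase while the identifier itself is case sensitive.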
In this case, appropriateness means concise, memorable, and durable. In general, the identifier should not contain terms that are likely to change over the very long term, such as implementation details like host names, software service names, and versions. Identifier length is restricted to make identifiers easy for system administrators and other programmers to read, recall, and type. DataONE UIs will make use of the name field of the Node record for display, so the identifier does not have to be meaningful for end users.
To become a Member Node (or Coordinating Node) in DataONE, a node must be authenticated by DataONE so that it can communicate securely with other DataONE nodes. One of the first steps in preparing the node for registration is receiving a DataONE certificate that will be used for negotiating secure connections with other nodes. This certificate is an X.509 certificate that is backed by a cryptographic key. The certificate contains a distinguished name that is included as the subject field in the node record. Over time, these node certificates will expire and will need to be renewed by installing the new certificate on the Member Node and updating the subject field if necessary. The Node record provided in DataONE can contain a list of subjects representing the node, each corresponding to a valid DataONE certificate installed on the node that can be used for authentication.
In addition, every node must have a contact person with whom DataONE can communicate about DataONE operations (such as new node certificates) and policies as needed. This contact person must be registered and verified with DataONE prior to registration.
Registration as a node in the DataONE network is accomplished by registering as a Member Node (or Coordinating Node) through an existing Coordinating Node registration service (see CNRegister.register()). This service takes a Types.Node description as input, including a proposed Types.NodeReference for this node and additional metadata such as the nodeContact in the Node description. If the NodeReference is syntactically correct and unique, and the nodeContact is a verified account registered with DataONE, then the registration service will successfully return the Types.NodeReference value for this node, which is then permanently assigned and cannot be reused or reassigned. At this point, the Types.Node has been registered but has not yet been approved. The request to become a node will be reviewed by DataONE and, if approved, the node will be added to the list of Nodes in the federation. At this point, the Node will be able to participate in all synchronization and replication services available in DataONE.
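The register-then-approve sequence described above can be modeled as a small state machine. This is an illustrative in-memory sketch only, not the Coordinating Node implementation: the Registry class, its method names, and RegistrationError are all hypothetical, and a non-empty identifier is assumed.

```python
import re
from dataclasses import dataclass, field

NODE_REF = re.compile(r"urn:node:[A-Za-z0-9_]+")


class RegistrationError(Exception):
    """Raised when a registration request is rejected (hypothetical type)."""


@dataclass
class Registry:
    """In-memory stand-in for a Coordinating Node registry (illustrative only)."""
    verified_contacts: set
    # Maps each assigned NodeReference to True once approved. Entries are never
    # removed, mirroring the rule that a reference cannot be reused or reassigned.
    assigned: dict = field(default_factory=dict)

    def register(self, node_ref: str, node_contact: str) -> str:
        if not NODE_REF.fullmatch(node_ref):
            raise RegistrationError("NodeReference is not syntactically correct")
        if node_ref in self.assigned:
            raise RegistrationError("NodeReference is already assigned")
        if node_contact not in self.verified_contacts:
            raise RegistrationError("nodeContact is not a verified account")
        self.assigned[node_ref] = False  # registered, awaiting approval
        return node_ref

    def approve(self, node_ref: str) -> None:
        if node_ref not in self.assigned:
            raise RegistrationError("unknown NodeReference")
        self.assigned[node_ref] = True

    def federation_nodes(self) -> list:
        """Only approved nodes appear in the list of federation nodes."""
        return sorted(ref for ref, approved in self.assigned.items() if approved)
```

A node that has been registered but not yet approved holds its NodeReference permanently, yet it is absent from federation_nodes() until approve() is called, which corresponds to the two distinct points in the paragraph above.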
Along with the production environment, DataONE maintains other environments of inter-communicating Coordinating and Member Nodes for various testing purposes. Aside from a unique list of nodes, each environment maintains its own set of data objects, object formats, and user accounts. The registration steps described below pertain to a single environment, so registering a node in a new environment requires running through the procedure in its entirety for that environment.
Step 1: Stand-alone testing
Prior to registration, the node needs to be tested for proper functionality of its services, and proper form of its content. Certain integration tests used by the core team have been deployed to a web server (http://mncheck.test.dataone.org) so member node implementers can test basic services in a stand-alone environment.
Step 2: Content checking
Not every aspect of the node can be checked prior to testing, and some tests take too long to be automated in a web-based platform. Also better done prior to node registration, content checking should be done to make sure that:
This step is best done in close coordination with the DataONE core developer team.
Step 3: Node Registration
Registering the node involves the following steps.
- Registering the nodeContact account with the environment via the identity portal. This account needs to be one compatible with CILogon. Using the portal:
  - Go to https://cn-{ENVIRONMENT}.dataone.org/portal
  - Choose your account provider (this step may be bypassed if you have already logged in).
  - At the My Account tab, fill out the Account Details fields, and click “Register.” (This will register this account and display the subject. If there is no button labeled “Register”, but one labeled “Update”, your account is already registered.)
  - The subject displayed is the part within the parentheses, in the format “CN=foo,DC=cilogon,DC=org”, and it is this value that must match what is in the Node record’s subject field.
- Submitting a cn.register(Session, Node) request, where the Session parameter contains the certificate of the person making the request, and the Node parameter is, in most cases, the Node record served by the mn.getCapabilities() service call (GET /node). Problems with the node record will be reported back as an exception.
- Approving the node.
  - Notify the DataONE contact person that the node has been registered and is ready for approval.
  - Review any content checking test results with the node contact.
  - DataONE will approve the node.
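Before submitting the registration request, it can help to confirm that the subject shown in the portal actually appears in the Node record served by the node. The sketch below assumes the record is XML with one or more subject elements, named after the field in the text; the function names are hypothetical, and the exact-string comparison (after trimming surrounding whitespace) is an assumption rather than a documented DataONE matching rule.

```python
import xml.etree.ElementTree as ET


def node_subjects(node_xml: str) -> list:
    """Collect the text of every <subject> element in a Node record."""
    root = ET.fromstring(node_xml)
    # Compare on the local name so the check works with or without a namespace.
    return [el.text.strip() for el in root.iter()
            if el.tag.rsplit("}", 1)[-1] == "subject" and el.text]


def subject_is_listed(node_xml: str, portal_subject: str) -> bool:
    """True if the subject displayed by the portal is one of the record's subjects."""
    return portal_subject.strip() in node_subjects(node_xml)
```

In practice, node_xml would be the body returned by the node's GET /node call, and portal_subject the value copied from the My Account tab.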
Step 4: Functional Integration testing (except in PROD environment)
At this point, the appropriate multi-node functional tests (for synchronization, replication, and updateSystemMetadata) will be run. Tests at this stage are intended to shake out remaining bugs, and will in most cases be done in close coordination with the DataONE core developer team. Success at this step requires a dedicated developer from the member node implementation team for about 1-2 weeks, as bug fixing at this point tends to be sequential.