.. _ReplicationOverview:
Replication Overview
====================
.. index:: Use Case 09, UC09, Replicate MN, replicate
Revision History
View document revision history_.
DataONE provides replication services to satisfy both data and metadata preservation needs and to provide the potential for fault-tolerance and load balancing services for data and metadata access. Tier 4 Member Nodes within the federation are set up to house replicas of content, and provide this service to other Member Nodes based on certain policy agreements. Replication is handled on a per-object basis within DataONE, with the RightsHolder and/or Authoritative Member Node controlling the ReplicationPolicy for each object, which determines whether it will be replicated. In addition, each Member Node decides whether it will accept replicas in general (by supporting the Tier 4 :class:`CNReplication` API and setting `replicate=true`), and can decide whether it will accept any given request to replicate an object. Coordinating Nodes monitor the :class:`Types.ReplicationPolicy` for each object in DataONE, and ensure that the appropriate :term:`replication target` nodes house an accurate replica of the object. Each replica of an object is recorded by the Coordinating Nodes, so when a consumer wishes to retrieve the object, they can use :func:`CNRead.resolve` to list the replicas and :func:`MNRead.get` to retrieve any of the replicas in the network.
Summary of Replication process
------------------------------
To fulfill the :class:`Types.ReplicationPolicy` for each object, the CN schedules each object to be replicated with one of the Tier 4 MNs that are willing to host replicas. Replication is an asynchronous, multi-step process, in order to allow for non-blocking replication of objects that would take more than a few seconds to copy over the network. The process originates with 1) the CN calling :func:`MNReplication.replicate` on the target MN, which is a request for the MN to replicate a particular object. The MN responds with a HTTP 200 if it is willing and able to attempt the replication and house the object, and the CN marks the replica request as REQUESTED. See :class:`Types.ReplicationStatus` for the definition of the status values. Then, 2) the target MN calls the source :func:`MNRead.getReplica` to request the bytes of the object, and if they are transferred correctly, then 3) calls :func:`CNReplication.setReplicationStatus` to indicate that the request has been COMPLETED, or if it FAILED. At this point the replication is finished. If the replication fails, the CN then requests that it be replicated elsewhere. If it succeeds, the CN will check in periodically with the MN to verify the checksum of the object held to confirm validity.
Object Replication Policy
-------------------------
The :class:`Types.ReplicationPolicy` for an object defines if replication should be attempted for this object, and if so, how many replicas should be maintained. It also permits specification of preferred and blocked nodes as potential replication targets.
If a ReplicationPolicy is provided in the :term:`System Metadata` for an object, then that policy is followed precisely by the Coordinating Nodes when managing replication. In the absence of a defined ReplicationPolicy for an object, DataONE will by default attempt to maintain two replicas for the object, as long as the object's size is below a threshold size that would allow transfer over networks in reasonable time periods. As network transfer capabilities improve among DataONE nodes, this threshold size will be increased.
.. code-block:: xml
urn:node:KNB
urn:node:PISCO
urn:node:SOMEBADNODE
Node Replication Policy
-----------------------
Nodes that wish to serve as a replication target and thereby are available to store replicas of data from around the network set :attr:`Types.Node.replicate` to 'true' in their :class:`Types.Node` description when registering their node. In addition, these nodes must support the Tier 4 :class:`CNReplication` API to allow Coordinating Nodes to perform all necessary operations. Nodes can express constraints on object size, total replication space available, source nodes, and object format types that a node will replicate by providing a :class:`Types.NodeReplicationPolicy` as part of a it's :class:`Types.Node` description. A node may choose to restrict replication from only certain peer nodes, may have file size limits, total allocated size limits, or may want to focus on being a :term:`replication target` for domain-specific object formats.
.. code-block:: xml
524288000
1099511627776
urn:node:KNB
urn:node:ESA
urn:node:SANPARKS
FGDC-STD-001.1-1999
eml://ecoinformatics.org/eml-2.1.1
text/csv
The :attr:`Types.NodeReplicationPolicy.maxObjectSize` indicates the maximum allowable size of an object to be replicated in bytes. The :attr:`Types.NodeReplicationPolicy.spaceAllocated` field sets an upper limit on space usage for replica storage on the given node. Once the spaceAllocated has been reached for a node, the Coordinating Nodes will no longer request that additional replicas be stored on that node. :attr:`Types.NodeReplicationPolicy.allowedNode` is used to list all nodes that are allowed to replicate to the target. If it is absent, then any node may replicate to the target. :attr:`Types.NodeReplicationPolicy.allowedObjectFormat` is used to list all object formats that may be replicated to the target. If it is absent, then any object format may be replicated to the target.
.. Note::
:class:`Types.NodeReplicationPolicy` is not currently implemented on the CN and so is ignored when making decisions as to which MN should be used for replication. In a future release, the CN scheduler will utilize the :class:`Types.NodeReplicationPolicy` to limit the types of objects that are scheduled to be replicated to a MN, but for now the information is not used at all.
.. _history: https://redmine.dataone.org/projects/d1/repository/changes/documents/Projects/cicore/architecture/api-documentation/source/design/ReplicationOverview.txt