status: under development
This document provides details about the process of becoming a DataONE Member Node. For an overview of the process, see Member Node Deployment Process on the main website.
The process of becoming a DataONE Member Node can be viewed as four phases: Planning, Developing, Testing, and Operating. We present these as a linear process for clarity and planning, but steps are often done in parallel, and an organization may work on some tasks from a later phase while the bulk of its effort is still in an earlier phase.
Plan the new Member Node.
Determine feasibility
Member Node representatives review the Member Node Documentation, in particular the DataONE Partnership Guidelines, and determine whether a partnership with DataONE makes sense and whether the organization has the resources required to implement and operate an MN successfully. Member Nodes can ask DataONE for information or help via the Contact Us page on the DataONE website.
Join the DataONE federation
Scope the implementation
The decisions made during this step will drive the details of planning the implementation below.
Data: First, the MN should decide which data, and how much of it, to make discoverable via DataONE. Some MNs choose to expose all their data, others only some, and still others expose all their data to a limited audience.
The MN should also consider the mutability of its data: whether the data is static, continuously updated, or a combination of the two.
DataONE Functionality: In conjunction with defining the scope of the holdings made visible via DataONE, the MN must also select the DataONE Tier it will support.
Member Nodes choose to expose various services, which we have organized into four tiers, starting with simple read-only access (Tier 1) and progressing through more complex services including authentication (Tier 2), write access (Tier 3), and replication (Tier 4). Select the level of functionality that the MN will provide as a partner in the DataONE infrastructure.
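As a rough illustration of the tier choice, the sketch below maps the services named above to tier numbers and returns the lowest tier that covers a set of desired capabilities. It assumes the tiers are cumulative (each tier includes the services of the tiers below it); the capability names and the `minimum_tier` helper are hypothetical and are not part of any DataONE library.

```python
from enum import IntEnum


class Tier(IntEnum):
    """The four tiers described above, numbered as in the text."""
    READ_ONLY = 1       # simple read-only access
    AUTHENTICATION = 2  # adds authentication
    WRITE = 3           # adds write access
    REPLICATION = 4     # adds replication


# Hypothetical capability names, used only for this illustration.
CAPABILITY_TIER = {
    "read": Tier.READ_ONLY,
    "authentication": Tier.AUTHENTICATION,
    "write": Tier.WRITE,
    "replication": Tier.REPLICATION,
}


def minimum_tier(capabilities):
    """Return the lowest tier that covers every requested capability."""
    unknown = set(capabilities) - set(CAPABILITY_TIER)
    if unknown:
        raise ValueError(f"unknown capabilities: {sorted(unknown)}")
    # Assumption: tiers are cumulative, so the highest required tier covers the rest.
    return max((CAPABILITY_TIER[c] for c in capabilities), default=Tier.READ_ONLY)


print(minimum_tier({"read", "authentication"}).name)  # AUTHENTICATION
```

An MN that only wants its holdings to be discoverable and readable would land on Tier 1 under this sketch, while one that also wants to accept content from DataONE clients would need at least Tier 3.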
Member Node Software Stack: Decide whether the MN will be fully or partially based on an existing software stack, such as Metacat or GMN; whether a completely custom implementation is required; or whether a hybrid approach will be used to adapt an existing DataONE-compatible software system to interact with an existing repository system.
After determining the scope of data holdings to be exposed via DataONE and the related questions above, the MN will determine the best approach for the MN implementation.
The MN will need to plan for any needed infrastructure changes at their site.
Data: If not all data holdings will be made discoverable via DataONE, the MN will need to plan and develop a mechanism to identify which data is to be harvested, or create a subset of data for DataONE use.
In any case, each data object will need to be assigned a DOI if it has not already been assigned one locally.
Functionality: Based on the desired Tier of operations, the MN may need to implement additional security-related functionality, such as authentication for Tier 2 and above.
Software Stack/other development: Depending on the resource requirements of any software development (i.e. a new or modified software stack), the MN should plan to allocate appropriate staff to the effort.
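For the Data item above, one simple mechanism for identifying what is to be harvested is a per-object flag in the local catalog. The following is a minimal sketch under that assumption; the `Holding` record and its `share_with_dataone` field are hypothetical, and a real repository would read this information from its own database or metadata.

```python
from dataclasses import dataclass


@dataclass
class Holding:
    """A locally held data object; all field names are hypothetical."""
    identifier: str
    share_with_dataone: bool


def select_for_dataone(holdings):
    """Return only the holdings flagged for exposure via DataONE."""
    return [h for h in holdings if h.share_with_dataone]


holdings = [
    Holding("dataset-001", True),
    Holding("dataset-002", False),
    Holding("dataset-003", True),
]
print([h.identifier for h in select_for_dataone(holdings)])
# ['dataset-001', 'dataset-003']
```

Keeping the selection rule in one function makes it easier to change later, for example if the MN decides to expose additional holdings.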
Determine whether there will be new data formats or new metadata formats that need to be registered. If no software stack development is needed, or no new data or metadata formats need to be registered, the Developing phase will be less costly in terms of time and resources.
Define a data management plan. If the MN already has an institutional DMP in place, this may be used or modified to reflect interactions with DataONE.
Consider the question of persistent identifiers, which is related to the mutability of the data. See Identifiers in DataONE.
The scope of the Developing phase is to build and test a working Member Node that passes the basic tests in the web-based Member Node Tester. The main things to put in place are the Member Node itself and any formats that would be new to DataONE.
Develop MN Software
Unless you are fortunate enough to already be using Metacat, or don’t have an existing data collection, developing the Member Node usually involves writing at least some integration code and, for some organizations, implementing the API methods themselves. At this point in the process you will simply be following your development plan.
You can iteratively use the web-based Member Node testing service throughout your development process to measure incremental progress.
Register Formats
If you are working with a format new to DataONE, it will need to be registered before DataONE can successfully synchronize content registered with that format. This is a distinct process that is also set up to run outside of Member Node deployment. If you are registering a new metadata format, DataONE developers will need to build, test, and deploy an indexing parser and HTML renderer to the Coordinating Nodes (CNs). Testing these elements is best done in the DEV environment, with the content of the new format originating either from the new Member Node or from sample content submitted to an existing node in DEV. This decision should be discussed with coredev.
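To decide whether format registration is needed at all, it can help to compare the format identifiers used by the MN's content with those already registered. The sketch below shows only that comparison; the `formats_to_register` helper is hypothetical, the identifier values are illustrative, and in practice the registered set would come from the current list of formats known to DataONE rather than a hard-coded set.

```python
def formats_to_register(formats_in_use, registered_formats):
    """Return the format identifiers used locally that are not yet registered."""
    return sorted(set(formats_in_use) - set(registered_formats))


# Illustrative values only; "x-example/new-metadata" is a made-up identifier.
registered = {"text/csv", "application/pdf", "eml://ecoinformatics.org/eml-2.1.1"}
in_use = ["text/csv", "eml://ecoinformatics.org/eml-2.1.1", "x-example/new-metadata"]

print(formats_to_register(in_use, registered))
# ['x-example/new-metadata']
```

An empty result suggests that no new formats need to be registered, which shortens the Developing phase.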
Passing MN Tests?
Once the required tests of the Member Node testing service are passing (see Development Testing), the prospective Member Node is ready to enter the Testing phase, where more thorough testing is done.
Once all data formats are registered and your software is fully developed, whether by yourself or by using an existing MN software stack, you can deploy and configure your node and register it in our Stage environment. This allows us to conduct a real-world test in an environment that is identical to the Production environment. The end point of this phase is a fully functional and integrated Member Node “in production”.
Test in STAGE
STAGE testing allows DataONE to conduct real-world tests in an environment that is identical to the Production environment. It is the first time that the Member Node’s entire content is synchronized, so this is where non-systematic content issues are usually revealed. Configuration issues are also identified here, especially those related to certificates and user subjects.
STAGE testing involves the following steps:
Deploy in Production Environment
After successful testing in the Stage environment, the MN can be deployed and registered in the Production environment (see Register in Production). Registering the MN in the Production environment is the final technical step required for DataONE to approve the node and for it to enter into operational status.
Mutual Acceptance
After the node is registered in the Production environment, both the node operators and DataONE conduct a final review of the node to confirm that it is operating as expected. This includes checks for content disparities and other issues that may not be detected by the automated tests. The node description and other metadata are checked for consistency and clarity. When the review is complete, DataONE and the node operators mutually approve the registration and move the MN into an operational state.
Operate the MN in production.
Announcement
The MN organization announces the new MN and DataONE showcases the MN through channels such as the DataONE newsletter and press releases.
Ongoing Production operations
The MN is operational and delivers services to the broader research community. Coordinating Nodes monitor the MN to ensure that it operates as intended. The node’s data and metadata are made available via the various MN and Coordinating Node services. Logs are kept on all services provided, and the Coordinating Nodes give MN operators automated access to aggregated statistics.
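To give a sense of what aggregated statistics over service logs might look like, the sketch below counts log records per object and event type. The record layout, field names, and event labels are assumptions made for illustration; they do not describe the actual log format or the statistics that the Coordinating Nodes produce.

```python
from collections import Counter

# Hypothetical log records; field names and event labels are assumptions.
log_records = [
    {"identifier": "dataset-001", "event": "read"},
    {"identifier": "dataset-001", "event": "read"},
    {"identifier": "dataset-002", "event": "read"},
    {"identifier": "dataset-001", "event": "replicate"},
]


def aggregate_events(records):
    """Count log records per (identifier, event) pair."""
    return Counter((r["identifier"], r["event"]) for r in records)


stats = aggregate_events(log_records)
print(stats[("dataset-001", "read")])  # 2
```

Aggregates of this kind let MN operators see how often their holdings are accessed without working through individual log entries.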
Participate in MN forums
The MN organization participates in MN forums to help monitor and evolve the DataONE federation to meet the research data needs of the community.