Warning: These documents are under active
development and subject to change (version 2.1.0-beta).
The latest release documents are at:
https://purl.dataone.org/architecture
The DataONE system should log various interactions and operations in the system
to provide operational status information about the entire system, to report on
specific node operations, and to inform DataONE participants (users,
contributors, administrators) about their specific domain of interest in the
system. For example, a contributor might want to monitor how their data is used and
where it is replicated. The methods MNCore.getLogRecords() and
CNCore.getLogRecords() provide outward-facing services for retrieving
log information from member and coordinating nodes, respectively.
Logging is described in use cases 16, 17, 18, 20, and potentially 19.
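As a rough illustration of how a client might address the getLogRecords() services, the sketch below only builds a request URL and makes no network call. The "/log" path, the parameter names (fromDate, toDate, event, start, count), and the example host are assumptions made for illustration; they are not stated in this document and should be checked against the API reference.

```python
from urllib.parse import urlencode


def build_log_query(base_url, from_date=None, to_date=None, event=None,
                    start=0, count=100):
    """Build a URL for a log-retrieval request.

    Assumption: the service is exposed at <base_url>/log and accepts the
    query parameters named below. These names are hypothetical here.
    """
    # Paging parameters are always sent; filters only when supplied.
    params = {"start": start, "count": count}
    optional = {"fromDate": from_date, "toDate": to_date, "event": event}
    params.update({k: v for k, v in optional.items() if v is not None})
    return f"{base_url.rstrip('/')}/log?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical member node base URL.
    print(build_log_query("https://mn.example.org/mn/v2", event="read"))
```

Keeping URL construction separate from the HTTP call makes it easy to reuse the same helper against a member node or a coordinating node base URL.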
The performance metrics survey results from the Leadership Team specify that (at least) the following metrics should be captured. Items that may be captured from the CI portion of the project are indicated by !!!.
Size and Diversity of DataONE Data, Metadata, and Investigator Toolkit Holdings
- Recorded in:
- Total (including replicas) data volume + unique object data volume
- Sysmeta
- system metadata
- !!
- Mechanism to register external uses / implementations
- HTTP user-agent
- object format from sysmeta
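The holdings metrics above (total volume including replicas, unique object volume, and object format) could in principle be derived from system metadata. The following is a minimal sketch under stated assumptions: the ObjectRecord type and its fields are hypothetical stand-ins for values read from system metadata, and "copies" is assumed to count every stored copy of an object, the original included.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ObjectRecord:
    """Hypothetical summary of one object's system metadata."""
    pid: str
    size_bytes: int
    format_id: str
    copies: int  # assumption: total stored copies, original included


def holdings_summary(records):
    """Summarize unique volume, total volume, and object counts per format."""
    # Unique volume counts each object once, regardless of replication.
    unique_bytes = sum(r.size_bytes for r in records)
    # Total volume counts every stored copy of every object.
    total_bytes = sum(r.size_bytes * r.copies for r in records)
    by_format = Counter(r.format_id for r in records)
    return {
        "unique_bytes": unique_bytes,
        "total_bytes": total_bytes,
        "objects_by_format": dict(by_format),
    }


if __name__ == "__main__":
    sample = [
        ObjectRecord("pid-a", 100, "text/csv", 3),
        ObjectRecord("pid-b", 50, "text/xml", 1),
    ]
    print(holdings_summary(sample))
```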
DataONE System Capacity
- registry
- per member node
- total
- part of MN registration and capabilities
- disk space
- capabilities / metadata
DataONE Usage Statistics
- standard web hits from logs
- aggregated across CNs
- Same as 16
Reliability and System Performance
- Need a monitoring service in addition to the CN service
- also need to consider geographic accessibility (users)
- Same as 18
- REST service performance
- Define a set of test queries that can be executed in parallel for load testing.
- Time for “page load” vs. number of concurrent users
- Time for specific operations (test queries, test renderings, ...)
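The load-testing items above (parallel test queries, and timing against the number of concurrent users) can be sketched with a small harness. This is an illustrative outline only: the function names are hypothetical, the "operations" are arbitrary callables standing in for real test queries, and a thread pool is just one way to simulate concurrent users.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_call(operation):
    """Return the wall-clock seconds taken by one call to operation()."""
    started = time.perf_counter()
    operation()
    return time.perf_counter() - started


def run_load_test(operations, concurrent_users):
    """Run every named operation once per simulated user, in parallel.

    operations: dict mapping a name to a zero-argument callable.
    Returns a dict mapping each name to its list of elapsed times.
    """
    results = {name: [] for name in operations}
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        futures = [
            (name, pool.submit(time_call, op))
            for name, op in operations.items()
            for _ in range(concurrent_users)
        ]
        for name, future in futures:
            results[name].append(future.result())
    return results


def summarize(results):
    """Reduce raw timings to count, mean, and maximum per operation."""
    return {
        name: {
            "count": len(times),
            "mean_s": sum(times) / len(times),
            "max_s": max(times),
        }
        for name, times in results.items()
    }


if __name__ == "__main__":
    # Stand-in for a real test query: sleep briefly.
    ops = {"sleep_10ms": lambda: time.sleep(0.01)}
    for users in (1, 4, 16):
        print(users, summarize(run_load_test(ops, users)))
```

Repeating the run for several values of concurrent_users gives the "time versus number of concurrent users" series; replacing the stand-in callable with real requests is left to the test author.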
Community Engagement
Education and Outreach
Sustainability
The following bullets represent the union of the logging information indicated in the use cases and the metrics that can be reported from the logs. The information to be logged, together with suitable summarization and extraction procedures, needs to be identified to ensure the following items can be addressed: