Warning: These documents are under active
development and subject to change (version 2.1.0-beta).
The latest release documents are at:
https://purl.dataone.org/architecture
Develop a generalized selector API and standard that can be used to identify granules within a data resource (data set, data package, composite object) managed by DataONE. The “selector” can be appended to an identifier or included with an API call that retrieves an object; the response is then the sub-element or component specified by the selector.
The types of selector are likely to vary with the types of composite object managed by DataONE. For example, a selector may be defined to return a byte range from a BLOB, or a single file from a zip archive containing a set of files.
Some examples:
There are practical limits to the total number of objects that can be managed effectively at the level of detail supported by the DataONE infrastructure. Managing content at the collection or package level constrains the total number of managed identifiers to a more manageable range. For example, a single collection may have >= 1e5 elements or records; with selectors, these could all be addressed through a single identifier for the collection plus support for the appropriate selector.
A “selector service” is therefore implemented by a node as a mechanism for retrieving an object that exists as a sub-component of a larger object managed by the DataONE infrastructure.
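To make the idea concrete, the following is a minimal Python sketch of how a node might resolve a selector appended to an identifier. It is an illustration under stated assumptions, not a DataONE API: the `#` separator, the `bytes=` and `member=` selector names, the in-memory `store` dictionary, and every function name here are hypothetical. Real DataONE identifiers may themselves contain `#`, so an actual standard would need to choose its delimiter or escaping more carefully.

```python
import io
import zipfile

# Hypothetical separator between identifier and selector (an assumption,
# not something defined by DataONE).
SELECTOR_SEPARATOR = "#"


def split_reference(reference):
    """Split 'identifier#selector' into (identifier, selector or None)."""
    identifier, sep, selector = reference.partition(SELECTOR_SEPARATOR)
    return identifier, (selector if sep else None)


def select_bytes(data, argument):
    """Return the inclusive byte range 'start-end' from a BLOB."""
    start, end = (int(part) for part in argument.split("-"))
    return data[start:end + 1]


def select_zip_member(data, argument):
    """Return the content of one named file from a zip archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.read(argument)


# Registry of selector types a node supports (hypothetical names).
SELECTORS = {"bytes": select_bytes, "member": select_zip_member}


def resolve(store, reference):
    """Return the whole object, or the sub-component named by the selector."""
    identifier, selector = split_reference(reference)
    data = store[identifier]
    if selector is None:
        return data
    kind, _, argument = selector.partition("=")
    if kind not in SELECTORS:
        raise ValueError(f"unsupported selector type: {kind!r}")
    return SELECTORS[kind](data, argument)


def make_demo_store():
    """Build a toy object store: one BLOB and one zip 'collection'."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("site_a.csv", "id,temp\n1,20.5\n")
        archive.writestr("site_b.csv", "id,temp\n1,18.0\n")
    return {"blob.1": b"0123456789", "collection.1": buffer.getvalue()}
```

With this sketch, `resolve(make_demo_store(), "collection.1#member=site_b.csv")` returns the bytes of a single file while only `collection.1` needs to be a managed identifier. Keeping the selector types in a registry reflects the expectation that they will vary with the kind of composite object.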