This document is licensed under a Creative Commons Attribution 3.0 License.
Provenance describes the origin and processing history of an artifact. Data provenance is an important form of metadata that explains how a particular data product was generated, by detailing the steps in the computational process producing it. Provenance information brings transparency and helps to audit and interpret data products. State-of-the-art scientific workflow systems (e.g. Kepler, Taverna, VisTrails) provide environments for specifying and enacting complex computational pipelines commonly referred to as scientific workflows. In such systems, provenance information is automatically captured in the form of execution traces. However, these systems often rely on proprietary formats that make the interchange of provenance information difficult. Furthermore, the workflow itself, which represents very useful information, may be disregarded in provenance traces. The evolution history of the workflow (i.e. its provenance) can likewise be missing. To address these shortcomings we propose ProvONE, a standard for scientific workflow provenance representation. ProvONE is defined as an extension of the W3C recommended standard PROV, aiming to capture the most relevant information concerning scientific workflow computational processes, and providing extension points to accommodate the specificities of particular scientific workflow systems.
This document specifies the ProvONE model and details how its constituting parts are related to the W3C PROV standard. The description provided is complemented by examples including queries on ProvONE data.
Version 1 Draft: This specification is in review and is publicly released for evaluation and possible adoption. However, it is not associated with and is not supported by any standards organization.
This specification was developed by the DataONE Cyberinfrastructure Working Group. If you wish to make comments regarding this document, please send them to developers@dataone.org. All comments are welcome.
The source for this specification and for the OWL document that implements it is maintained in our sem-prov-ontologies GitHub repository. We welcome pull requests via that repository if you wish to propose specific changes to the specification or the OWL model.
Historically, one of the main uses of provenance has been to support claims of attribution and authenticity, and therefore of value for material objects (e.g. works of art, manuscripts, etc.). In science, provenance is required to provide evidence in support of the experimental results that underpin scientific publications. The importance of provenance still applies in e-Science settings, where the data is obtained through computational methods. In these cases, the provenance of the experimental outcome is typically a graph structured account of the individual computational steps, which is recorded automatically, at the level of detail specified by the system instrumentation. This form of provenance, suitably encoded for machine processing, can then be exploited using a variety of graph query and analysis tools.
This scenario, where each piece of scientific data obtained by a computational method is associated with its provenance, is becoming increasingly prevalent. Regarding scientific workflows, detailed execution traces are routinely collected by a number of broadly used Workflow Management Systems (WfMSs) including Taverna, Kepler, VisTrails, Galaxy, e-Science Central, Pegasus, and others. However, these systems often adopt proprietary models for encoding the provenance traces captured by workflow executions. Moreover, they adopt different models to specify the workflows themselves. Such heterogeneity makes it difficult for a scientist to analyze and compare provenance traces captured using the same or similar workflows that were specified and enacted using different systems. The absence of a standard model for representing workflow provenance also means that opportunities for stitching the traces produced by different workflows, and therefore assisting the scientist in her analysis, are likely to be missed.
This document presents ProvONE, a model for scientific workflow provenance that aims to fulfill the requirements of the desired standard. The name originates from its development in the context of the DataONE Project, which is creating a large scale and federated data infrastructure serving the earth sciences community. Nevertheless, ProvONE is designed to support a large variety of WfMSs that in turn are used by numerous scientific communities.
The provenance community has made significant efforts in developing standard models that can be used for capturing and publishing provenance of artifacts and resources on the Web. These efforts resulted in, first, the Open Provenance Model (OPM) [MCF+11], and more recently, the W3C PROV model [PROV]. While such models are useful and are being adopted in academia and industry alike, as suggested by the number of PROV implementations, they do not suffice for encoding scientific workflow provenance. The reason is that both OPM and PROV were developed as minimal models meant to be used for tracking the provenance of resources on the Web regardless of their types. As such, they do not provide all the concepts that are necessary for specifying workflows and encoding the provenance of data products used and generated as a result of their execution. Consequently, many WfMSs adopt their own provenance models, resulting in the aforementioned loss of interoperability opportunities.
Thus the need arises for a new model that acts as a standard for encoding scientific workflow provenance. Instead of creating such a model from scratch, the W3C PROV model can be used as a starting point. A preliminary proposal following this direction was published in [MDB+13]; an independent extension of PROV for scientific workflows is also presented in [CSdO+13], as well as in [BKG+13] (focusing on workflow preservation). This document aims to incorporate and standardize the ideas of these works, as well as additional contributions, to derive an adequate standard that can be used by the scientific workflow community.
ProvONE aims to provide the fundamental information required to understand and analyze scientific workflow-based computational experiments. Therefore, it covers the main aspects that have been identified as relevant in the provenance literature. These correspond to prospective and retrospective provenance [ZWF06] as well as process provenance [FSC+06]; additionally, some essential elements of data structure are also considered. Each of these aspects is described next.
Section 2 provides an overview of the ProvONE conceptual model, covering the aspects outlined in Section 1.2. The conceptual model of ProvONE is given using the Unified Modeling Language [UML].
Section 3 provides a detailed characterization of the various components of ProvONE, which is serialized as an OWL 2 ontology. It clarifies how the ProvONE concepts are related to the W3C PROV concepts, accompanying the descriptions with examples.
Section 4 gives references to additional resources that form part of the ProvONE standard.
The following namespaces and prefixes are used throughout this document.
prefix | namespace IRI | definition |
prov | http://www.w3.org/ns/prov# | The PROV namespace [PROVO] |
provone | http://purl.dataone.org/provone/2015/01/15/ontology# | The ProvONE namespace [ProvONE] |
xsd | http://www.w3.org/2001/XMLSchema# | XML Schema namespace [XMLSCHEMA11-2] |
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# | The RDF namespace [RDF-CONCEPTS] |
rdfs | http://www.w3.org/2000/01/rdf-schema# | The RDFS namespace [RDF-SCHEMA] |
owl | http://www.w3.org/2002/07/owl# | OWL 2 specification namespace [OWL2] |
dcterms | http://purl.org/dc/terms/ | Dublin Core Metadata Elements namespace [DC-RDF] |
bibo | http://purl.org/ontology/bibo | The Bibliographic Ontology namespace [BIBO] |
wfms | http://www.wfms.org/registry.xsd | Placeholder example WfMS namespace |
: | http://example.com/ | Artificial namespace for examples |
This section introduces ProvONE informally through a UML class diagram representing its conceptual model and brief descriptions of each of the aspects covered by the model.
The ProvONE conceptual model is illustrated by the UML diagram of Figure 1. All classes have a corresponding PROV type denoted by a UML stereotype (e.g. «entity»), whereas this is the case for only a subset of the associations (e.g. «used»). Each of the aspects covered by ProvONE is briefly described next.
Workflow Representation. The various tasks that form part of a workflow are represented by the Program class. Programs can be either atomic or composite, the latter case specified through the hasSubProgram self association. A given program can be distinguished as a Workflow. Each Program may have a series of Ports that function as input or output ports. Ports from the various Programs are connected through Channels. Note that both input and output ports may be associated with multiple Channels, thus allowing workflow models in which a single output is copied and sent to multiple destinations, as well as in which tasks take inputs from different sources through a single input port.
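The fan-out case described above, in which a single output is sent to multiple destinations, can be sketched in Turtle as follows. This fragment is only illustrative and is not part of the specification: all resource names (prog_a, prog_b, prog_c, their ports, and the two channels) are hypothetical, and the prefixes follow the namespace table of this document.

```turtle
@prefix provone: <http://purl.dataone.org/provone/2015/01/15/ontology#> .
@prefix : <http://example.com/> .

# Hypothetical producer with a single output port.
:prog_a a provone:Program ;
    provone:hasOutPort :a_out .

# Two hypothetical consumers, each with one input port.
:prog_b a provone:Program ;
    provone:hasInPort :b_in .
:prog_c a provone:Program ;
    provone:hasInPort :c_in .

:a_out a provone:Port .
:b_in  a provone:Port .
:c_in  a provone:Port .

:ch_ab a provone:Channel .
:ch_ac a provone:Channel .

# The single output port is attached to two channels (fan-out) ...
:a_out provone:connectsTo :ch_ab , :ch_ac .

# ... and each input port is attached to the channel that feeds it.
:b_in provone:connectsTo :ch_ab .
:c_in provone:connectsTo :ch_ac .
```

The symmetric case, a single input port fed from several sources, would attach one input Port to several Channels in the same way.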
In order to specify executable instances of a Workflow, default parameters can be defined for some of its constituent Programs. The default parameters are represented by Entities that will be described shortly. A Controller class can be used to specify that the execution of a given Program is controlled by another Program, which allows for differing models of computation. For instance, in a synchronous dataflow model, a given Program may only start once the execution of a preceding Program terminates.
Trace Representation. The execution traces associated with a given Workflow are represented in ProvONE through the Execution class. Each Execution instance represents the execution of a particular Program (its Plan), which itself may be a Workflow, and may also be associated with a User responsible for the execution. For the execution of a Program, a series of input Entity items are read from the input Ports and are used to generate a series of output Entity items sent through the output Ports. These outputs may be Data, Visualization, or Document items, depending on the goals of the Workflow. Through the use of the Usage and Generation classes, whenever an Entity item is sent from an output Port to an input Port, this event is recorded through the hadEntity, hadInPort and hadOutPort properties between the Entity item and the associated Ports. In this manner, the graph structure that represents the provenance of the workflow results is generated.
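As a rough sketch of how these pieces fit together, the following Turtle fragment records one Entity item passing from the output Port of one Execution to the input Port of another. The resource names (exec_a, exec_b, gen_1, use_1, item_1, a_out, b_in) are hypothetical. The fragment follows the usage adopted in this document, where an Execution is the subject of prov:qualifiedGeneration and prov:qualifiedUsage; it is not a normative pattern.

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix provone: <http://purl.dataone.org/provone/2015/01/15/ontology#> .
@prefix : <http://example.com/> .

# Hypothetical upstream execution: produces item_1 on output port a_out.
:exec_a a provone:Execution ;
    prov:qualifiedGeneration :gen_1 .

:gen_1 a prov:Generation ;
    provone:hadEntity :item_1 ;
    provone:hadOutPort :a_out .

# Hypothetical downstream execution: reads item_1 from input port b_in.
:exec_b a provone:Execution ;
    prov:qualifiedUsage :use_1 ;
    prov:wasInformedBy :exec_a .

:use_1 a prov:Usage ;
    provone:hadEntity :item_1 ;
    provone:hadInPort :b_in .

# The unqualified PROV relations give the plain dependency graph.
:item_1 a provone:Data ;
    prov:wasGeneratedBy :exec_a .
:exec_b prov:used :item_1 .

:a_out a provone:Port .
:b_in  a provone:Port .
```

The qualified Generation and Usage nodes carry the Port information, while prov:used and prov:wasGeneratedBy alone are enough to traverse the dependency graph.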
Data Structure Representation. The various entities associated with workflow instances and traces are represented by the Data class, the Visualization class, or the Document class. The Data class is defined to be generic and represents data items of various types (e.g. XML, JSON, CSV files, etc.). Visualizations are a differentiated class intended to represent various visualization items often output from workflows (JPG, PNG, SVG, MP4, etc.). The Document class is a generic representation of a published or unpublished article or report that was created as a result of a given Execution of a Program or Workflow. In the ProvONE model, each Entity subclass instance is uniquely identifiable regardless of it sharing the same value as another Entity instance. Although specific data types are not covered directly in ProvONE, collections of Entity items are represented through the Collection class. A Collection may in turn represent a set, bag, list or another variant of a group of items.
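The fragment below is a minimal sketch of the Entity subclasses described above. It shows a Visualization and a Document produced by the same Execution, plus two Data items that share a value yet remain distinct resources. All names, identifiers, titles and values are hypothetical and serve only as illustration.

```turtle
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix provone: <http://purl.dataone.org/provone/2015/01/15/ontology#> .
@prefix : <http://example.com/> .

:exec_x a provone:Execution .

# A hypothetical image output of the execution.
:plot_1 a provone:Visualization ;
    dcterms:identifier "plot1"^^xsd:string ;
    dcterms:title "Example map"^^xsd:string ;
    prov:wasGeneratedBy :exec_x .

# A hypothetical report output of the same execution.
:report_1 a provone:Document ;
    dcterms:identifier "report1"^^xsd:string ;
    dcterms:title "Example summary report"^^xsd:string ;
    prov:wasGeneratedBy :exec_x .

# Two Data items with the same value are still two distinct entities.
:threshold_a a provone:Data ;
    dcterms:identifier "thr_a"^^xsd:string ;
    prov:value "0.5"^^xsd:string .
:threshold_b a provone:Data ;
    dcterms:identifier "thr_b"^^xsd:string ;
    prov:value "0.5"^^xsd:string .
```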
Workflow Evolution Representation. The specific changes that are performed during the specification of a Workflow are not modeled directly in ProvONE, since these are expected to vary among different WfMSs. However, the different versions of a Workflow form a derivation tree that can be represented using PROV's wasDerivedFrom association, as is explained in the next section.
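A derivation tree of this kind might look as follows in Turtle. The four workflow versions (wf_v1, wf_v2a, wf_v2b, wf_v3) are hypothetical; the sketch only shows that two versions can branch from the same ancestor and that a later version can extend one branch.

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix provone: <http://purl.dataone.org/provone/2015/01/15/ontology#> .
@prefix : <http://example.com/> .

# Root of the hypothetical derivation tree.
:wf_v1 a provone:Workflow .

# Two alternative revisions branching from the same original.
:wf_v2a a provone:Workflow ;
    prov:wasDerivedFrom :wf_v1 .
:wf_v2b a provone:Workflow ;
    prov:wasDerivedFrom :wf_v1 .

# A further revision of one branch.
:wf_v3 a provone:Workflow ;
    prov:wasDerivedFrom :wf_v2a .
```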
The ProvONE constructs are summarized in Table 2. The first column lists the aspects covered by ProvONE, serving to indicate the various constructs associated with each aspect. The second and third columns indicate the type of each construct as presented in the UML class diagram (class or association) and the construct name, respectively. The last column contains a link to each construct specification in Section 3.
This section presents the specification of the various components of the ProvONE model outlined in the previous section, covering them as presented in Figure 1 and Table 2. The specification takes the form of an OWL 2 [OWL2] ontology that extends the W3C PROV-O ontology [PROVO].
The namespace for all ProvONE terms is http://purl.dataone.org/provone/2015/01/15/ontology#.
The encoding of the ProvONE ontology can be found under this link: provone.owl
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#Program
The following RDF fragment specifies a Program identified within the RDF document by program_1.
 1  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
 2  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
 3  @prefix owl: <http://www.w3.org/2002/07/owl#> .
 4  @prefix dcterms: <http://purl.org/dc/terms/> .
 5  @prefix prov: <http://www.w3.org/ns/prov#> .
 6  @prefix provone: <http://purl.dataone.org/provone/2015/01/15/ontology#> .
 7  @prefix wfms: <http://www.wfms.org/registry.xsd> .
 8  @prefix : <http://example.com/> .
 9
10  :program_1
11
12      a provone:Program;
13      dcterms:identifier "e1"^^xsd:string;
14      dcterms:title "CDMSBoxfill"^^xsd:string;
15      wfms:package "gov.llnl.uvcdat.cdms"^^xsd:string;
16
17  .
Line 12 specifies class membership. To specify additional attributes for the Program, we first employ the Dublin Core Metadata Elements [DC-RDF]. In particular, line 13 assigns the string e1 as an explicit identifier, which is independent of the RDF node identifier. In addition, a descriptive title is given in line 14. Additional attributes associated with the specific WfMS in use, in this case a placeholder example, can be specified in a similar fashion. Here, the software package responsible for the execution of the Program within the WfMS is specified in line 15.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#Port
The following RDF fragment specifies a Port identified within the RDF document by p1_ip1.
1  :p1_ip1
2
3      a provone:Port;
4      dcterms:identifier "e1_ip1"^^xsd:string;
5      dcterms:title "input_vars"^^xsd:string;
6      wfms:signature "gov.llnl.uvcdat.cdms:CDMSVariable"^^xsd:string;
7
8  .
Again we make use of the Dublin Core Metadata Elements as well as of elements specific to an example placeholder WfMS. The Port is given an identifier and a descriptive title in lines 4 and 5, respectively. A signature denoting the type of data that the Port consumes is defined in line 6.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#Channel
The following RDF fragment specifies a Channel identified within the RDF document by p1_p2Ch.
1  :p1_p2Ch
2
3      a provone:Channel;
4      dcterms:identifier "e1_e2Ch"^^xsd:string;
5
6  .
The Channel is simply specified as such (line 3) and given the string e1_e2Ch as an identifier (line 4).
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#Controller
The following RDF fragment specifies a Controller identified within the RDF document by p1_p2CL.
1  :p1_p2CL
2
3      a provone:Controller;
4      dcterms:identifier "e1_e2CL"^^xsd:string;
5
6  .
The Controller is given the string e1_e2CL as an identifier (line 4).
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#Workflow
The following RDF fragment specifies a Workflow identified within the RDF document by workflow_1.
1  :workflow_1
2
3      a provone:Workflow;
4      dcterms:identifier "wf1"^^xsd:string;
5      dcterms:title "ModelComparison"^^xsd:string;
6
7  .
The Workflow is given the string wf1 as an identifier in line 4. In addition, it is given the string ModelComparison as a descriptive title in line 5.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#hasSubProgram
The following RDF fragment illustrates the use of the hasSubProgram object property by extending Example 1 of the Program class.
1  :top_program
2
3      a provone:Program;
4      dcterms:identifier "main"^^xsd:string;
5
6  .
7
8  :top_program provone:hasSubProgram :program_1 .
A Program identified within the document as top_program is given the identifier value main. Subsequently, line 8 specifies that the same Program top_program has as a sub-program the Program program_1, defined in Example 1.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#controlledBy
The following RDF fragment illustrates the use of the controlledBy object property by complementing Example 1 of the Program class and Example 5 of the Controller class.
1
2  :program_1 provone:controlledBy :p1_p2CL .
3
Line 2 specifies that the Program program_1, defined in Example 1, is the source Program of the Controller p1_p2CL, defined in Example 5.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#controls
The following RDF fragment illustrates the use of the controls object property by complementing Example 5 of the Controller class.
 1
 2  :program_2
 3
 4      a provone:Program;
 5      dcterms:identifier "e2"^^xsd:string;
 6      dcterms:title "TemporalStatistics"^^xsd:string;
 7  .
 8
 9  :p1_p2CL provone:controls :program_2 .
10
A Program identified within the document as program_2 is given the identifier value e2 and TemporalStatistics as its title. Line 9 states that the Controller p1_p2CL, defined in Example 5, has as its destination the Program program_2.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#hasInPort
The following RDF fragment illustrates the use of the hasInPort object property by complementing Example 1 of the Program class and Example 2 of the Port class.
1
2  :program_1 provone:hasInPort :p1_ip1 .
3
Line 2 specifies that the Program program_1, defined in Example 1, has an input Port p1_ip1, defined in Example 2.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#hasOutPort
The following RDF fragment illustrates the use of the hasOutPort object property by complementing Example 1 of the Program class and the Port class.
1
2  :program_1 provone:hasOutPort :p1_op1 .
3
4  :p1_op1 a provone:Port .
Line 2 specifies that the Program program_1, defined in Example 1, has an output Port p1_op1, which line 4 declares as a Port.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#hasDefaultParam
The following RDF fragment illustrates the use of the hasDefaultParam object property by complementing Example 2 of the Port class and Example 24 of the Data class.
1
2  :p1_ip1 provone:hasDefaultParam :data1 .
3
Line 2 specifies that the Port p1_ip1, defined in Example 2 and associated in Example 11 with Program program_1 of Example 1, has as a default parameter the Data item data1, defined in Example 40.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#connectsTo
The following RDF fragment illustrates the use of the connectsTo object property.
 1  :pmain_ip1
 2      a provone:Port;
 3      dcterms:identifier "e1_ip1"^^xsd:string;
 4  .
 5
 6  :ch1
 7      a provone:Channel;
 8      dcterms:identifier "pmain_ch1"^^xsd:string;
 9  .
10
11  :pmain_ip1 provone:connectsTo :ch1 .
First, an input Port pmain_ip1 is defined in lines 1-4. Lines 6-9 specify the Channel ch1. The connectsTo statement in line 11 then associates the input Port with the Channel.
prov:wasDerivedFrom is adopted in ProvONE, in relation to workflow structure, to describe the evolution of programs and workflows.
IRI:http://www.w3.org/ns/prov#wasDerivedFrom
The following RDF fragment illustrates the use of the wasDerivedFrom object property by extending Example 6 of the Workflow class.
1
2  :workflow_1update1
3
4      a provone:Workflow;
5      dcterms:identifier "wf1upd1"^^xsd:string;
6      dcterms:title "ModelComparison"^^xsd:string;
7  .
8  :workflow_1update1 prov:wasDerivedFrom :workflow_1 .
9
First, a Workflow workflow_1update1 is defined beginning at line 2. Line 8 specifies that Workflow workflow_1update1 was derived from Workflow workflow_1 of Example 6, which implies that it is a new version and the result of workflow evolution. Hence it is given the same title.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#Execution
The following RDF fragment specifies an Execution identified within the RDF document by program_1ex1.
 1  :program_1ex1
 2
 3      a provone:Execution;
 4      dcterms:identifier "e1_ex1"^^xsd:string;
 5      prov:startedAtTime "2013-08-21T13:37:53"^^xsd:dateTime;
 6      prov:endedAtTime "2013-08-21T13:37:53"^^xsd:dateTime;
 7      wfms:cached "0"^^xsd:integer;
 8      wfms:completed "1"^^xsd:integer;
 9
10  .
An Execution is created with the string e1_ex1 as an identifier. Timestamps denoting the moments at which the execution began and was completed are specified through the prov:startedAtTime and prov:endedAtTime data properties in lines 5 and 6, respectively. In addition, the value 0 in line 7 indicates that the result was not obtained from a cache, while the value 1 in line 8 indicates that the execution was completed successfully.
IRI:http://www.w3.org/ns/prov#Association
The following RDF fragment specifies an Association identified within the RDF document by association_1.
1  :association_1
2
3      a prov:Association;
4      prov:hadPlan :program_1;
5  .
First, an Association association_1 is defined beginning at line 1. Line 4 specifies that Association association_1 had Program program_1, defined in Example 1, as a plan.
IRI:http://www.w3.org/ns/prov#Usage
The following RDF fragment specifies a Usage identified within the RDF document by usage_1.
1  :usage_1
2
3      a prov:Usage;
4      prov:used :dataSetA;
5      provone:hadEntity :dataSetA;
6  .
7  :dataSetA a prov:Entity .
A Usage usage_1 is defined beginning at line 1. Line 4 specifies that an Entity dataSetA is used, and line 5 associates the same Entity with the Usage through hadEntity.
IRI:http://www.w3.org/ns/prov#Generation
The following RDF fragment specifies a Generation identified within the RDF document by generation_1.
1  :generation_1
2
3      a prov:Generation;
4      prov:wasGeneratedBy :program_1ex1;
5      provone:hadEntity :dataSetB;
6  .
7  :dataSetB a prov:Entity .
A Generation generation_1 is defined beginning at line 1. Line 4 states that generation_1 was generated by Execution program_1ex1, defined in Example 20, and line 5 states that it had an Entity dataSetB.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#User
The following RDF fragment specifies a User identified within the RDF document by user_1.
1  :user_1
2
3      a provone:User;
4      dcterms:identifier "user_eg_1"^^xsd:string;
5  .
A User user_1 is created and given the string user_eg_1 as an identifier.
prov:used is adopted in ProvONE to state that an Execution made use of a particular Entity item as input for its execution.
IRI:http://www.w3.org/ns/prov#used
The following RDF fragment illustrates the use of the used object property by complementing Example 8 of the Execution class and Example 40 of the Data class.
1
2  :program_1ex1 prov:used :data1 .
3
Line 2 specifies that Execution program_1ex1 of Example 20 used as an input Data item data1 of Example 26.
prov:wasGeneratedBy is adopted in ProvONE to state that an Execution produced a particular Entity item as output with its execution.
IRI:http://www.w3.org/ns/prov#wasGeneratedBy
The following RDF fragment illustrates the use of the wasGeneratedBy object property by complementing Example 20 of the Execution class.
 1
 2  :data2
 3
 4      a provone:Data;
 5      dcterms:identifier "cdms1"^^xsd:string;
 6      rdfs:label "cdms_data"^^xsd:string;
 7      wfms:type "gov.llnl.uvcdat.cdms:CDMSVariable"^^xsd:string;
 8  .
 9  :data2 prov:wasGeneratedBy :program_1ex1 .
10
First, a Data item data2 is defined beginning at line 2. Line 9 specifies that the Data item data2 was produced as an output of Execution program_1ex1 of Example 20.
prov:wasAssociatedWith is adopted in ProvONE to state that a User was associated with a particular Execution. This serves as an assignment of attribution and responsibility.
IRI:http://www.w3.org/ns/prov#wasAssociatedWith
The following RDF fragment illustrates the use of the wasAssociatedWith object property by complementing Example 20 of the Execution class.
1  :program_1ex1 prov:wasAssociatedWith :user_1 .
Line 1 specifies that the User user_1 was associated with Execution program_1ex1 of Example 20.
prov:wasInformedBy is adopted in ProvONE to state that an Execution communicates with another Execution through an output-input relation, and thereby triggers its execution.
IRI:http://www.w3.org/ns/prov#wasInformedBy
The following RDF fragment illustrates the use of the wasInformedBy object property by complementing Example 8 of the Execution class.
 1
 2  :program_2ex1
 3
 4      a provone:Execution;
 5      dcterms:identifier "e2_ex1"^^xsd:string;
 6      prov:startedAtTime "2013-08-21T13:37:54"^^xsd:dateTime;
 7      prov:endedAtTime "2013-08-21T13:37:54"^^xsd:dateTime;
 8      wfms:cached "0"^^xsd:integer;
 9      wfms:completed "1"^^xsd:integer;
10
11  .
12
13  :program_2ex1 prov:wasInformedBy :program_1ex1 .
14
First, an Execution program_2ex1 is defined beginning at line 2. Line 13 specifies that Execution program_2ex1 received data from Execution program_1ex1 of Example 20.
wasPartOf enables the specification of the structure of Execution instances, in that a parent Execution (associated with a Workflow) has child Executions (associated with Programs and subworkflows).
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#wasPartOf
The following RDF fragment illustrates the use of the wasPartOf object property by complementing Example 6 of the Workflow class and Example 20 of the Execution class.
 1
 2  :workflow_1ex1
 3
 4      a provone:Execution;
 5      dcterms:identifier "wf1_ex1"^^xsd:string;
 6      prov:startedAtTime "2013-08-21T13:37:54"^^xsd:dateTime;
 7      prov:endedAtTime "2013-08-21T13:37:59"^^xsd:dateTime;
 8      wfms:completed "1"^^xsd:integer;
 9
10  .
11
12  :workflow_1ex1 prov:wasAssociatedWith :workflow_1 .
13
14  :program_1ex1 provone:wasPartOf :workflow_1ex1 .
15
First, an Execution workflow_1ex1 is defined beginning at line 2 and associated with Workflow workflow_1 of Example 6 in line 12. Line 14 specifies that Execution program_1ex1 of Example 20 is part of Execution workflow_1ex1.
IRI:http://www.w3.org/ns/prov#qualifiedAssociation
The following RDF fragment illustrates the use of the qualifiedAssociation object property by complementing Example 6 of the Workflow class and Example 20 of the Execution class.
1  :program_1ex1
2      prov:qualifiedAssociation [
3          a prov:Association;
4          prov:agent :user_1;
5          prov:hadPlan :program_1;
6          rdfs:comment "user_1 created this association.";
7      ]
8  .
This example is complementary to the Execution program_1ex1 defined in Example 20. Line 2 specifies that program_1ex1 has a qualified association. Line 5 names Program program_1, defined in Example 1, as the plan, and line 4 associates program_1ex1 with a User user_1.
IRI:http://www.w3.org/ns/prov#agent
The following RDF fragment illustrates the use of the agent object property by complementing Example 21 of the Association class.
1  :association_1 prov:agent :foo_University .
2
3  :foo_University a prov:Organization, prov:Agent .
The Association association_1 is defined to have an agent foo_University, which is an Organization.
IRI:http://www.w3.org/ns/prov#hadPlan
The following RDF fragment illustrates the use of the hadPlan object property by complementing Example 21 of the Association class and Example 6 of the Workflow class.
1  :association_1 prov:hadPlan :program_1 .
Line 1 specifies that the Association association_1 has Program program_1 as a plan.
IRI:http://www.w3.org/ns/prov#qualifiedUsage
The following RDF fragment illustrates the use of the qualifiedUsage object property by complementing Example 20 of the Execution class.
1  :program_1ex1
2      prov:qualifiedUsage [
3          a prov:Usage;
4          prov:entity :dataSetA;
5      ]
6  .
Line 2 defines a qualified usage for the Execution program_1ex1 of Example 20, and line 4 names the Entity dataSetA as the entity that was used.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#hadInPort
The following RDF fragment illustrates the use of the hadInPort object property by complementing Example 22 of the Usage class.
1  :usage_1 provone:hadInPort :p1_ip1 .
Line 1 specifies that the Usage usage_1, defined in Example 22, has an input Port p1_ip1, defined in Example 2.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#hadEntity
The following RDF fragment illustrates the use of the hadEntity object property by complementing Example 23 of the Generation class and Example 22 of the Usage class.
1  :generation_1 provone:hadEntity :dataSetC .
2  :usage_1 provone:hadEntity :dataSetD .
3
4  :dataSetC a prov:Entity .
5  :dataSetD a prov:Entity .
Line 1 specifies that the Generation generation_1 has an Entity dataSetC. Line 2 specifies that the Usage usage_1 has an Entity dataSetD.
IRI:http://www.w3.org/ns/prov#qualifiedGeneration
The following RDF fragment illustrates the use of the qualifiedGeneration object property by complementing Example 20 of the Execution class.
1  :program_1ex1
2      prov:qualifiedGeneration [
3          a prov:Generation;
4          prov:atTime "2013-08-21T13:37:53"^^xsd:dateTime;
5      ]
6  .
Line 2 defines that the Execution program_1ex1 of Example 20 has a qualified generation, and line 4 gives it the timestamp 2013-08-21T13:37:53.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#hadOutPort
The following RDF fragment illustrates the use of the hadOutPort object property by complementing Example 23 of the Generation class.
1  :generation_1 provone:hadOutPort :p1_op1 .
Line 1 specifies that the Generation generation_1 has as an output the Port p1_op1, defined in Example 12.
IRI:http://www.w3.org/ns/prov#Entity
Instead of specifying a new class or subclass, we explicitly adopt PROV's prov:Collection class as part of the ProvONE model, whose description we cite below.
IRI:http://www.w3.org/ns/prov#Collection
The following RDF fragment specifies a Collection identified within the RDF document by col1.
1  :col1
2
3      a prov:Collection;
4      dcterms:identifier "inputset1"^^xsd:string;
5
6  .
7
A Collection is created with the string inputset1 as an identifier.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#Data
The following RDF fragment specifies a Data item identified within the RDF document by data1.
 1  :data1
 2
 3      a provone:Data;
 4      dcterms:identifier "defparam1"^^xsd:string;
 5      rdfs:label "filename"^^xsd:string;
 6      prov:value "DLEM_NEE_onedeg_v1.0nc"^^xsd:string;
 7      wfms:type "edu.sci.wfms.basic:File"^^xsd:string;
 8
 9  .
10
A Data item is created with the string defparam1 as an identifier. It is also given the descriptive string filename through the rdfs:label data property. The prov:value data property specifies the actual value of the data item, namely DLEM_NEE_onedeg_v1.0nc. Finally, the type of the data item as defined by the WfMS is specified in line 7 to be edu.sci.wfms.basic:File.
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#Visualization
IRI:http://purl.dataone.org/provone/2015/01/15/ontology#Document
prov:wasDerivedFrom is adopted in ProvONE, in relation to data structure, to describe dependencies between the Data items produced during workflow execution.
IRI:http://www.w3.org/ns/prov#wasDerivedFrom
The following RDF fragment illustrates the use of the wasDerivedFrom object property by extending Example 40 of the Data class.
 1
 2  :data2
 3
 4      a provone:Data;
 5      dcterms:identifier "defparam1"^^xsd:string;
 6      rdfs:label "filename"^^xsd:string;
 7      prov:value "DLEM_NEE_onedeg_v1.0nc"^^xsd:string;
 8      wfms:type "edu.sci.wfms.basic:File"^^xsd:string;
 9
10  .
11
12  :data2 prov:wasDerivedFrom :data1 .
13
First, a Data item data2 is defined beginning at line 2. Line 12 specifies that the Data item data2 was produced from Data item data1 of Example 40.
prov:hadMember is adopted in ProvONE, in relation to data structure, to specify the Data items that form part of a Collection.
IRI:http://www.w3.org/ns/prov#hadMember
The following RDF fragment illustrates the use of the hadMember object property by extending Example 39 of the Collection class.
 1
 2  :infile1
 3
 4      a provone:Data;
 5      dcterms:identifier "data_file1"^^xsd:string;
 6      rdfs:label "file1"^^xsd:string;
 7      prov:value "file1.dat"^^xsd:string;
 8      wfms:type "edu.sci.wfms.basic:File"^^xsd:string;
 9
10  .
11
12  :infile2
13
14      a provone:Data;
15      dcterms:identifier "data_file2"^^xsd:string;
16      rdfs:label "file2"^^xsd:string;
17      prov:value "file2.dat"^^xsd:string;
18      wfms:type "edu.sci.wfms.basic:File"^^xsd:string;
19
20  .
21
22  :col1 prov:hadMember :infile1 .
23
24  :col1 prov:hadMember :infile2 .
25
Two Data items infile1 and infile2 are defined in lines 2-20. Line 22 specifies that Data item infile1 was a member of Collection item col1 of Example 39. Analogously, line 24 specifies Data item infile2 also as part of Collection item col1.