Tags

Academic
What exactly happened to LSID? What exactly happened to LSID? It was a technically sound approach it would seem and one whose failure we would do well to learn more from.
s11 Citation & Bibliography Style Guide The s11 House Rules recommend this opinionated bibliography style for academic writing.
Adoptopenjdk
Installing ProvToolbox on macOS

ProvToolbox is a useful command line tool for validating and visualizing PROV documents, but unfortunately it can be a bit of a challenge to install on Windows and on macOS because of its dependency requirements.

This post suggests three step-by-step methods of installing ProvToolbox on your Mac – you should follow the method you feel most comfortable with, but can try the other methods in case of problems.

Table of contents

  1. Overview of requirements
    1. Software packaging for macOS
  2. Conda
    1. Installing Graphviz and OpenJDK with Conda
  3. HomeBrew
    1. Installing Graphviz with HomeBrew
    2. Installing OpenJDK with HomeBrew
  4. Installing manually
    1. Installing AdoptOpenJDK manually
    2. Installing Graphviz manually
  5. Installing ProvToolbox
    1. Using ProvToolbox from VSCode

Overview of requirements

As of 2020-08, ProvToolbox 0.9.5 is the latest release, which requires:

Installing ProvToolbox in Windows

While there are several tools available for validating and visualizing PROV, the ProvToolbox is perhaps the most useful for validating PROV-N syntax. However, the normal releases do not run on Windows due to an operating system restriction on command line and folder path length.

We have suggested a fix, but while we wait for that, here we describe a patch build that should work on Windows. We also show how to install dependencies: Java for executing ProvToolbox, and Graphviz for visualization. (See also macOS install).

Attribution
Attribution vs association

A valid question when writing provenance in the responsibility view and the process view is: should we attribute entities to their contributors, or isn't that what the activities are already showing? In this blog post we explore the different options.

Especially with roles it may seem unnecessary to also declare wasAttributedTo statements.

It is true that from:

wasAttributedTo(ex:entity, ex:agent)

you can conclude that there was some activity X such that:

wasGeneratedBy(ex:entity, X)  
wasAssociatedWith(X, ex:agent)

This conclusion follows from the constraint on agents and the definition of wasAttributedTo.
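
As a rough illustration of that inference, here is a minimal Python sketch. It is not taken from the post or from any PROV library: the function name `infer_from_attribution`, the tuple representation of statements and the placeholder activity name `_:X` are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

# A statement is modelled as (relation, first argument, second argument).
# This representation is an assumption for illustration, not a PROV API.
Statement = Tuple[str, str, str]


def infer_from_attribution(
    entity: str, agent: str, activity: Optional[str] = None
) -> List[Statement]:
    """Derive the statements implied by wasAttributedTo(entity, agent).

    The activity is only known to exist, so unless a name is supplied
    we use a hypothetical placeholder identifier "_:X".
    """
    act = activity if activity is not None else "_:X"
    return [
        ("wasGeneratedBy", entity, act),
        ("wasAssociatedWith", act, agent),
    ]


# Print the implied statements in the informal notation used above.
for relation, first, second in infer_from_attribution("ex:entity", "ex:agent"):
    print(f"{relation}({first}, {second})")
```

The sketch only shows the direction of the inference: attribution implies that some generating activity associated with the agent exists, but it says nothing about which activity it was or what role the agent played.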

Bagit
Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv Journal article published in GigaScience
Bioschemas
BioHackEU23 report: Enabling FAIR Digital Objects with RO-Crate, Signposting and Bioschemas BioHackrXiv preprint from ELIXIR BioHackathon 2023
Citation
s11 Citation & Bibliography Style Guide The s11 House Rules recommend this opinionated bibliography style for academic writing.
Collections
Multiple agents sharing roles

Assuming the task of writing provenance for a student group exercise, consider the question:

Do we need to assign everyone in the group a specific role since in our group we found that for many of the tasks, everyone worked together to complete it?

MSc Student in Understanding Data and their Environment, University of Manchester, 2020

This blog post explores the different PROV patterns that could describe this scenario.

Command line
Validating and visualising PROV

This blog post gives a gentle PROV-N introduction and then explores tools for validating and visualising PROV.

One of the advantages of W3C PROV having a common data model is that it can be serialized, or written out, in multiple file formats. The PROV family of W3C specifications describes the mappings PROV-XML and PROV-O (which, being based on OWL2, itself has multiple serializations for Linked Data, including the RDF formats Turtle and JSON-LD).

In addition to these standard approaches we also have PROV-JSON and PROV-JSONLD, which could be well-suited for Web applications. All of these can in theory be mapped to each other through the common PROV Data Model and the use of URIs as Linked Data global identifiers.

Common workflow language
Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv Journal article published in GigaScience
Computational workflow
FAIR Computational workflows Journal article published in Data Intelligence
Conda
Installing ProvToolbox on macOS

ProvToolbox is a useful command line tool for validating and visualizing PROV documents, but unfortunately it can be a bit of a challenge to install on Windows and on macOS because of its dependency requirements.

This post suggests three step-by-step methods of installing ProvToolbox on your Mac – you should follow the method you feel most comfortable with, but can try the other methods in case of problems.

Table of contents

  1. Overview of requirements
    1. Software packaging for macOS
  2. Conda
    1. Installing Graphviz and OpenJDK with Conda
  3. HomeBrew
    1. Installing Graphviz with HomeBrew
    2. Installing OpenJDK with HomeBrew
  4. Installing manually
    1. Installing AdoptOpenJDK manually
    2. Installing Graphviz manually
  5. Installing ProvToolbox
    1. Using ProvToolbox from VSCode

Overview of requirements

As of 2020-08, ProvToolbox 0.9.5 is the latest release, which requires:

Containers
Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv Journal article published in GigaScience
Cwl
Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv Journal article published in GigaScience
Data management plan
Enhancing Research Data Management in Galaxy and Data Stewardship Wizard by utilising RO-Crates BioHackrXiv preprint from ELIXIR BioHackathon 2022
Dmp
Enhancing Research Data Management in Galaxy and Data Stewardship Wizard by utilising RO-Crates BioHackrXiv preprint from ELIXIR BioHackathon 2022
Fair
Applying the FAIR Principles to Computational Workflows Journal article published in Scientific Data
FAIR Computational workflows Journal article published in Data Intelligence
WorkflowHub: a registry for computational workflows arXiv preprint
Fair digital object
BioHackEU23 report: Enabling FAIR Digital Objects with RO-Crate, Signposting and Bioschemas BioHackrXiv preprint from ELIXIR BioHackathon 2023
Faq
Attribution vs association

A valid question when writing provenance in the responsibility view and the process view is: should we attribute entities to their contributors, or isn't that what the activities are already showing? In this blog post we explore the different options.

Especially with roles it may seem unnecessary to also declare wasAttributedTo statements.

It is true that from:

wasAttributedTo(ex:entity, ex:agent)

you can conclude that there was some activity X such that:

wasGeneratedBy(ex:entity, X)  
wasAssociatedWith(X, ex:agent)

This conclusion follows from the constraint on agents and the definition of wasAttributedTo.

Multiple agents sharing roles

Assuming the task of writing provenance for a student group exercise, consider the question:

Do we need to assign everyone in the group a specific role since in our group we found that for many of the tasks, everyone worked together to complete it?

MSc Student in Understanding Data and their Environment, University of Manchester, 2020

This blog post explores the different PROV patterns that could describe this scenario.

Resources that change state

The PROV working group received a question from Mike:

My understanding is that an entity referenced in a PROV bundle (e.g. via wasGeneratedBy) must be in the bundle…but I do not wish to duplicate entity definitions through out my bundles. My entities are long lived and will exist in multiple bundles.
So lets say I have a resource for alarms which contains a list of all alarms my company monitors. If I turn off the alarm at alarm/1, my understanding is that in PROV a new entity is created for the new state of alarm/1.
But in my actual data store, I don’t create a new record, I just toggle a flag. So there is a disconnect between how my PROV looks and how my data looks. This is by design is my understanding.
So I would have a new entity in my prov for the alarm/1 in the new state which is a specialization of alarm/1, yes? Ultimately, I want to display all of the provenance for alarm/1 so I can see its history from creation to invalidation. Am I going about this the wrong way?

Galaxy
Enhancing Research Data Management in Galaxy and Data Stewardship Wizard by utilising RO-Crates BioHackrXiv preprint from ELIXIR BioHackathon 2022
Graphviz
Installing ProvToolbox on macOS

ProvToolbox is a useful command line tool for validating and visualizing PROV documents, but unfortunately it can be a bit of a challenge to install on Windows and on macOS because of its dependency requirements.

This post suggests three step-by-step methods of installing ProvToolbox on your Mac – you should follow the method you feel most comfortable with, but can try the other methods in case of problems.

Table of contents

  1. Overview of requirements
    1. Software packaging for macOS
  2. Conda
    1. Installing Graphviz and OpenJDK with Conda
  3. HomeBrew
    1. Installing Graphviz with HomeBrew
    2. Installing OpenJDK with HomeBrew
  4. Installing manually
    1. Installing AdoptOpenJDK manually
    2. Installing Graphviz manually
  5. Installing ProvToolbox
    1. Using ProvToolbox from VSCode

Overview of requirements

As of 2020-08, ProvToolbox 0.9.5 is the latest release, which requires:

Groups
Multiple agents sharing roles

Assuming the task of writing provenance for a student group exercise, consider the question:

Do we need to assign everyone in the group a specific role since in our group we found that for many of the tasks, everyone worked together to complete it?

MSc Student in Understanding Data and their Environment, University of Manchester, 2020

This blog post explores the different PROV patterns that could describe this scenario.

Homebrew
Installing ProvToolbox on macOS

ProvToolbox is a useful command line tool for validating and visualizing PROV documents, but unfortunately it can be a bit of a challenge to install on Windows and on macOS because of its dependency requirements.

This post suggests three step-by-step methods of installing ProvToolbox on your Mac – you should follow the method you feel most comfortable with, but can try the other methods in case of problems.

Table of contents

  1. Overview of requirements
    1. Software packaging for macOS
  2. Conda
    1. Installing Graphviz and OpenJDK with Conda
  3. HomeBrew
    1. Installing Graphviz with HomeBrew
    2. Installing OpenJDK with HomeBrew
  4. Installing manually
    1. Installing AdoptOpenJDK manually
    2. Installing Graphviz manually
  5. Installing ProvToolbox
    1. Using ProvToolbox from VSCode

Overview of requirements

As of 2020-08, ProvToolbox 0.9.5 is the latest release, which requires:

Interoperability
Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv Journal article published in GigaScience
Java
Installing ProvToolbox in Windows

While there are several tools available for validating and visualizing PROV, the ProvToolbox is perhaps the most useful for validating PROV-N syntax. However, the normal releases do not run on Windows due to an operating system restriction on command line and folder path length.

We have suggested a fix, but while we wait for that, here we describe a patch build that should work on Windows. We also show how to install dependencies: Java for executing ProvToolbox, and Graphviz for visualization. (See also macOS install).

Licenses
Choosing an open source license The s11 House Rules recommend these open source licenses for software and creative work.
Linked data
What exactly happened to LSID? What exactly happened to LSID? It was a technically sound approach it would seem and one whose failure we would do well to learn more from.
Macos
Installing ProvToolbox on macOS

ProvToolbox is a useful command line tool for validating and visualizing PROV documents, but unfortunately it can be a bit of a challenge to install on Windows and on macOS because of its dependency requirements.

This post suggests three step-by-step methods of installing ProvToolbox on your Mac – you should follow the method you feel most comfortable with, but can try the other methods in case of problems.

Table of contents

  1. Overview of requirements
    1. Software packaging for macOS
  2. Conda
    1. Installing Graphviz and OpenJDK with Conda
  3. HomeBrew
    1. Installing Graphviz with HomeBrew
    2. Installing OpenJDK with HomeBrew
  4. Installing manually
    1. Installing AdoptOpenJDK manually
    2. Installing Graphviz manually
  5. Installing ProvToolbox
    1. Using ProvToolbox from VSCode

Overview of requirements

As of 2020-08, ProvToolbox 0.9.5 is the latest release, which requires:

Namespace
What are good PROV-N prefixes?

In this blog post we explore the role of PROV-N prefixes and how to decide on a good namespace to use for your own custom provenance terms.

Most examples of PROV-N use example prefixes like:

prefix ex <http://example.com/>
prefix exg <http://example.org/government>

These example domains are explicitly reserved globally for all kinds of examples and training material, and deliberately do not have any content, advertisement or affiliations.

Assume you are writing the provenance of a student group exercise: should you be using the prefix/namespace ex and example.org to define agents, entities and relationships, as well as your own attribute types?
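
To make the role of a prefix concrete, the following Python sketch expands a prefixed name into a full URI by concatenating the declared namespace with the local name. It is an illustration under stated assumptions rather than a PROV-N parser: the `expand` helper, the `PREFIXES` table and the local names `alice` and `report` are hypothetical and not from the post.

```python
from typing import Dict

# Prefix declarations copied from the example above.
PREFIXES: Dict[str, str] = {
    "ex": "http://example.com/",
    "exg": "http://example.org/government",
}


def expand(qualified_name: str, prefixes: Dict[str, str]) -> str:
    """Expand 'prefix:local' to a URI by concatenating namespace and local name."""
    prefix, separator, local = qualified_name.partition(":")
    if not separator:
        raise ValueError(f"not a prefixed name: {qualified_name!r}")
    if prefix not in prefixes:
        raise ValueError(f"undeclared prefix: {prefix!r}")
    return prefixes[prefix] + local


print(expand("ex:alice", PREFIXES))    # http://example.com/alice
print(expand("exg:report", PREFIXES))  # http://example.org/governmentreport
```

Because the expansion is plain concatenation, a namespace that does not end in `/` or `#` (like `exg` here) runs straight into the local name, which is one reason to pick the namespace URI with care.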

Ontology
PAV Ontology paper highly accessed

Our recent paper about the PAV ontology has been classified as highly accessed by Journal of Biomedical Semantics, with more than 1097 views since it was published two months ago, with an Altmetric score of 12.

The PAV ontology provides a lightweight approach to record typical Provenance, Authorship and Versioning information, and builds upon existing standards like PROV-O and DC Terms. Our previous Practical Provenance post gives a brief overview of PAV, but you might also want to explore these links for more details:

Open source
Choosing an open source license The s11 House Rules recommend these open source licenses for software and creative work.
Paper
PAV Ontology paper highly accessed

Our recent paper about the PAV ontology has been classified as highly accessed by Journal of Biomedical Semantics, with more than 1097 views since it was published two months ago, with an Altmetric score of 12.

The PAV ontology provides a lightweight approach to record typical Provenance, Authorship and Versioning information, and builds upon existing standards like PROV-O and DC Terms. Our previous Practical Provenance post gives a brief overview of PAV, but you might also want to explore these links for more details:

Pav
Tracking versions with PAV

The PAV ontology specializes the W3C PROV-O standard to give a lightweight approach to recording details about a resource, giving its Provenance, Authorship and Versioning. Our paper on PAV explores all of these aspects in detail. In this blog post we discuss Versioning as modelled by PAV, including the hierarchical organization of versions.

Versioning is commonly used for software releases (e.g. Windows 8.1, Firefox 26, Python 3.3.2), but increasingly also for datasets and documents. For the purpose of provenance, a version number allows the declaration of the current state of a resource, which can be cross-checked against release notes and used for references, for instance to indicate which particular version of a dataset was used in producing an analysis report.

PAV Ontology paper highly accessed

Our recent paper about the PAV ontology has been classified as highly accessed by Journal of Biomedical Semantics, with more than 1097 views since it was published two months ago, with an Altmetric score of 12.

The PAV ontology provides a lightweight approach to record typical Provenance, Authorship and Versioning information, and builds upon existing standards like PROV-O and DC Terms. Our previous Practical Provenance post gives a brief overview of PAV, but you might also want to explore these links for more details:

Recording authorship, curation and digital creation with the PAV ontology PAV is a lightweight ontology for tracking Provenance, Authoring and Versioning.  PAV supplies terms for distinguishing between the different roles of the agents contributing content in current web based systems: contributors, authors, curators and digital artifact creators. The ontology also provides terms for tracking provenance of digital entities that are published on the web and then accessed, transformed and consumed.
Prefix
What are good PROV-N prefixes?

In this blog post we explore the role of PROV-N prefixes and how to decide on a good namespace to use for your own custom provenance terms.

Most examples of PROV-N use example prefixes like:

prefix ex <http://example.com/>
prefix exg <http://example.org/government>

These example domains are explicitly reserved globally for all kinds of examples and training material, and deliberately do not have any content, advertisement or affiliations.

Assume you are writing the provenance of a student group exercise: should you be using the prefix/namespace ex and example.org to define agents, entities and relationships, as well as your own attribute types?

Preservation
Archive

In an alternate reality I am sure I would be a librarian.

With ongoing linkrot and degradation of old Internet content, in particular reference rot from academic literature, it can be hard to find digital content even from as recently as 10 years ago.

It follows that academics now should be their own library curator, to preserve their own outputs and reference materials.

In this section I catalogue some of the content and software I have rescued, archived or rediscovered.

Wf4Ever project Wf4Ever was a research project funded by EU Framework 7 to investigate how scientific workflows and their data could be better preserved for reproducibility, reuse and resilience against workflow decay.
Prov
PROV-N Cheat Sheet This is a quick “cheat sheet” for the PROV-N syntax.
Installing ProvToolbox on macOS

ProvToolbox is a useful command line tool for validating and visualizing PROV documents, but unfortunately it can be a bit of a challenge to install on Windows and on macOS because of its dependency requirements.

This post suggests three step-by-step methods of installing ProvToolbox on your Mac – you should follow the method you feel most comfortable with, but can try the other methods in case of problems.

Table of contents

  1. Overview of requirements
    1. Software packaging for macOS
  2. Conda
    1. Installing Graphviz and OpenJDK with Conda
  3. HomeBrew
    1. Installing Graphviz with HomeBrew
    2. Installing OpenJDK with HomeBrew
  4. Installing manually
    1. Installing AdoptOpenJDK manually
    2. Installing Graphviz manually
  5. Installing ProvToolbox
    1. Using ProvToolbox from VSCode

Overview of requirements

As of 2020-08, ProvToolbox 0.9.5 is the latest release, which requires:

Installing ProvToolbox in Windows

While there are several tools available for validating and visualizing PROV, the ProvToolbox is perhaps the most useful for validating PROV-N syntax. However, the normal releases do not run on Windows due to an operating system restriction on command line and folder path length.

We have suggested a fix, but while we wait for that, here we describe a patch build that should work on Windows. We also show how to install dependencies: Java for executing ProvToolbox, and Graphviz for visualization. (See also macOS install).

Attribution vs association

A valid question when writing provenance in the responsibility view and the process view is: should we attribute entities to their contributors, or isn't that what the activities are already showing? In this blog post we explore the different options.

Especially with roles it may seem unnecessary to also declare wasAttributedTo statements.

It is true that from:

wasAttributedTo(ex:entity, ex:agent)

you can conclude that there was some activity X such that:

wasGeneratedBy(ex:entity, X)  
wasAssociatedWith(X, ex:agent)

This conclusion follows from the constraint on agents and the definition of wasAttributedTo.

Multiple agents sharing roles

Assuming the task of writing provenance for a student group exercise, consider the question:

Do we need to assign everyone in the group a specific role since in our group we found that for many of the tasks, everyone worked together to complete it?

MSc Student in Understanding Data and their Environment, University of Manchester, 2020

This blog post explores the different PROV patterns that could describe this scenario.

What are good PROV-N prefixes?

In this blog post we explore the role of PROV-N prefixes and how to decide on a good namespace to use for your own custom provenance terms.

Most examples of PROV-N use example prefixes like:

prefix ex <http://example.com/>
prefix exg <http://example.org/government>

These example domains are explicitly reserved globally for all kinds of examples and training material, and deliberately do not have any content, advertisement or affiliations.

Assume you are writing the provenance of a student group exercise: should you be using the prefix/namespace ex and example.org to define agents, entities and relationships, as well as your own attribute types?

Validating and visualising PROV

This blog post gives a gentle PROV-N introduction and then explores tools for validating and visualising PROV.

One of the advantages of W3C PROV having a common data model is that it can be serialized, or written out, in multiple file formats. The PROV family of W3C specifications describes the mappings PROV-XML and PROV-O (which, being based on OWL2, itself has multiple serializations for Linked Data, including the RDF formats Turtle and JSON-LD).

In addition to these standard approaches we also have PROV-JSON and PROV-JSONLD, which could be well-suited for Web applications. All of these can in theory be mapped to each other through the common PROV Data Model and the use of URIs as Linked Data global identifiers.

Resources that change state

The PROV working group received a question from Mike:

My understanding is that an entity referenced in a PROV bundle (e.g. via wasGeneratedBy) must be in the bundle…but I do not wish to duplicate entity definitions through out my bundles. My entities are long lived and will exist in multiple bundles.
So lets say I have a resource for alarms which contains a list of all alarms my company monitors. If I turn off the alarm at alarm/1, my understanding is that in PROV a new entity is created for the new state of alarm/1.
But in my actual data store, I don’t create a new record, I just toggle a flag. So there is a disconnect between how my PROV looks and how my data looks. This is by design is my understanding.
So I would have a new entity in my prov for the alarm/1 in the new state which is a specialization of alarm/1, yes? Ultimately, I want to display all of the provenance for alarm/1 so I can see its history from creation to invalidation. Am I going about this the wrong way?

PROV released as W3C Recommendations

The Provenance Working Group was chartered to develop a framework for interchanging provenance on the Web. The Working Group has now published the PROV Family of Documents as W3C Recommendations, along with corresponding supporting notes. You can find a complete list of the documents in the PROV Overview Note.
PROV enables one to represent and interchange provenance information using widely available formats such as RDF and XML. In addition, it provides definitions for accessing provenance information, validating it, and mapping to Dublin Core. Learn more about the Semantic Web.

Locating provenance for a RESTful web service

This blog post shows how RESTful web services can provide, and link to, provenance data for their exposed resources by using the PROV-AQ mechanism of HTTP Link headers. This is demonstrated by showing how to update a hello world REST service implemented with Java and JAX-RS 2.0 to provide these links.

The PROV-AQ HTTP mechanism is most easily explained by an example:

GET http://example.com/resource.html HTTP/1.1
Accept: text/html

HTTP/1.1 200 OK
Content-type: text/html
Link: <http://example.com/resource-provenance>; 
         rel="http://www.w3.org/ns/prov#has_provenance"; 
         anchor="http://example.com/resource"

<html>
  <!-- ... -->
</html>

This request for http://example.com/resource.html returns some HTML, but also provides a Link: header that says that the provenance is located at http://example.com/resource-provenance.
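
On the client side, locating the provenance comes down to reading that Link: header. The following Python sketch shows one way this might be done, using only the standard library. The function name `provenance_links` and the simplified regular expressions are assumptions for this example. They handle headers shaped like the one above and are not a complete parser for the Link header grammar.

```python
import re
from typing import List, Optional, Tuple

HAS_PROVENANCE = "http://www.w3.org/ns/prov#has_provenance"

# One link value: <target> followed by zero or more ;name=value parameters.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*("[^"]*"|[^;,\s]+)')


def provenance_links(link_header: str) -> List[Tuple[str, Optional[str]]]:
    """Return (provenance URI, anchor) pairs found in a Link header value."""
    found = []
    for match in _LINK.finditer(link_header):
        target, raw_params = match.group(1), match.group(2)
        params = {
            name.lower(): value.strip('"')
            for name, value in _PARAM.findall(raw_params)
        }
        # rel may hold several space-separated relation types.
        if HAS_PROVENANCE in params.get("rel", "").split():
            found.append((target, params.get("anchor")))
    return found


header = (
    '<http://example.com/resource-provenance>; '
    'rel="http://www.w3.org/ns/prov#has_provenance"; '
    'anchor="http://example.com/resource"'
)
for uri, anchor in provenance_links(header):
    print(f"Provenance of {anchor} is at {uri}")
```

The anchor matters because the provenance describes http://example.com/resource, the resource itself, rather than the particular HTML representation that was requested.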

W3C PROV Implementations: Preliminary Analysis

By Khalid Belhajjame, syndicated from https://khalidbelhajjame.wordpress.com/2013/04/04/w3c-prov-implementations/

In the beginning of December 2012, the W3C Provenance Working Group issued a call for implementations. As of February the 25th 2013, 64 PROV implementations were reported to the W3C Provenance Working Group.

These implementations took different forms ranging from stand alone applications (30), to reusable frameworks and libraries (10), to services hosted by third parties (9), to vocabularies (21), and constraints validation modules (3).

Recording authorship, curation and digital creation with the PAV ontology PAV is a lightweight ontology for tracking Provenance, Authoring and Versioning.  PAV supplies terms for distinguishing between the different roles of the agents contributing content in current web based systems: contributors, authors, curators and digital artifact creators. The ontology also provides terms for tracking provenance of digital entities that are published on the web and then accessed, transformed and consumed.
Tutorial on the W3C PROV family of specifications

Posted by Khalid Belhajjame

Provenance, a form of structured metadata designed to record the origin or source of information, can be instrumental in deciding whether information is to be trusted, how it can be integrated with other diverse information sources, and how to establish attribution of information to authors throughout its history.

The PROV set of specifications, produced by the World Wide Web Consortium (W3C), is designed to promote the publication of provenance information on the Web, and offers a basis for interoperability across diverse provenance management systems. The PROV provenance model is deliberately generic and domain-agnostic, but extension mechanisms are available and can be exploited for modelling specific domains.

What can provenance do for me?

Also available on Slideshare, pdf and as pptx.

The above presentation was originally given at the Metagenomics, metagenetics and Phylogenetic workflows for Ocean Sampling Day Workshop at Max Planck Institute for Marine Microbiology on 2013-03-21 by Stian Soiland-Reyes. Reuse allowed under the Creative Commons Attribution license 3.0.

Prov n
PROV-N Cheat Sheet This is a quick “cheat sheet” for the PROV-N syntax.
What are good PROV-N prefixes?

In this blog post we explore the role of PROV-N prefixes and how to decide on a good namespace to use for your own custom provenance terms.

Most examples of PROV-N use example prefixes like:

prefix ex <http://example.com/>
prefix exg <http://example.org/government>

These example domains are explicitly reserved globally for all kinds of examples and training material, and deliberately do not have any content, advertisement or affiliations.

Assume you are writing the provenance of a student group exercise: should you be using the prefix/namespace ex and example.org to define agents, entities and relationships, as well as your own attribute types?

Validating and visualising PROV

This blog post gives a gentle PROV-N introduction and then explores tools for validating and visualising PROV.

One of the advantages of W3C PROV having a common data model is that it can be serialized, or written out, in multiple file formats. The PROV family of W3C specifications describes the mappings PROV-XML and PROV-O (which, being based on OWL2, itself has multiple serializations for Linked Data, including the RDF formats Turtle and JSON-LD).

In addition to these standard approaches we also have PROV-JSON and PROV-JSONLD, which could be well-suited for Web applications. All of these can in theory be mapped to each other through the common PROV Data Model and the use of URIs as Linked Data global identifiers.

Provenance
Tracking workflow execution with TavernaProv

Apache Taverna is a scientific workflow system for combining web services and local tools. Taverna records provenance of workflow runs, intermediate values and user interactions, both as an aid for debugging while designing the workflow, but also as a record for later reproducibility and comparison.

Taverna also records provenance of the evolution of the workflow definition (including a chain of wasDerivedFrom relations), attributions and annotations; for brevity we here focus on how Taverna’s workflow run provenance extends PROV and is embedded with Research Objects.

Recording authorship, curation and digital creation with the PAV ontology PAV is a lightweight ontology for tracking Provenance, Authoring and Versioning.  PAV supplies terms for distinguishing between the different roles of the agents contributing content in current web based systems: contributors, authors, curators and digital artifact creators. The ontology also provides terms for tracking provenance of digital entities that are published on the web and then accessed, transformed and consumed.
FAIR Computational workflows Journal article published in Data Intelligence
Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv Journal article published in GigaScience
Provtoolbox
Installing ProvToolbox on macOS

ProvToolbox is a useful command line tool for validating and visualizing PROV documents, but unfortunately it can be a bit of a challenge to install on Windows and on macOS because of its dependency requirements.

This post suggests three step-by-step methods of installing ProvToolbox on your Mac – you should follow the method you feel most comfortable with, but can try the other methods in case of problems.

Table of contents

  1. Overview of requirements
    1. Software packaging for macOS
  2. Conda
    1. Installing Graphviz and OpenJDK with Conda
  3. HomeBrew
    1. Installing Graphviz with HomeBrew
    2. Installing OpenJDK with HomeBrew
  4. Installing manually
    1. Installing AdoptOpenJDK manually
    2. Installing Graphviz manually
  5. Installing ProvToolbox
    1. Using ProvToolbox from VSCode

Overview of requirements

As of 2020-08, ProvToolbox 0.9.5 is the latest release, which requires:

Installing ProvToolbox in Windows

While there are several tools available for validating and visualizing PROV, the ProvToolbox is perhaps the most useful for validating PROV-N syntax. However, the normal releases do not run on Windows due to an operating system restriction on command line and folder path length.

We have suggested a fix, but while we wait for that, here we describe a patch build that should work on Windows. We also show how to install dependencies: Java for executing ProvToolbox, and Graphviz for visualization. (See also macOS install).

Validating and visualising PROV

This blog post gives a gentle PROV-N introduction and then explores tools for validating and visualising PROV.

One of the advantages of W3C PROV having a common data model is that it can be serialized, or written out, in multiple file formats. The PROV family of W3C specifications describes the mappings PROV-XML and PROV-O (which, being based on OWL2, itself has multiple serializations for Linked Data, including the RDF formats Turtle and JSON-LD).

In addition to these standard approaches we also have PROV-JSON and PROV-JSONLD, which could be well-suited for Web applications. All of these can in theory be mapped to each other through the common PROV Data Model and the use of URIs as Linked Data global identifiers.

Locating provenance for a RESTful web service

This blog post shows how RESTful web services can provide, and link to, provenance data for their exposed resources by using the PROV-AQ mechanism of HTTP Link headers. This is demonstrated by showing how to update a hello world REST service implemented with Java and JAX-RS 2.0 to provide these links.

The PROV-AQ HTTP mechanism is most easily explained by an example:

GET http://example.com/resource.html HTTP/1.1
Accept: text/html

HTTP/1.1 200 OK
Content-type: text/html
Link: <http://example.com/resource-provenance>; 
         rel="http://www.w3.org/ns/prov#has_provenance"; 
         anchor="http://example.com/resource"

<html>
  <!-- ... -->
</html>

This request for http://example.com/resource.html returns some HTML, but also provides a Link: header that says that the provenance is located at http://example.com/resource-provenance.
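On the client side, locating the provenance then comes down to reading the Link: header and picking out the links whose relation is has_provenance. The following is a minimal Python sketch of that step, not part of the original post. The function names parse_link_header and provenance_links are made up for illustration, and the parser is simplified on the assumption that every link parameter is a quoted string, as in the example above.

```python
import re

# Relation type used by PROV-AQ, as shown in the example response above
HAS_PROVENANCE = "http://www.w3.org/ns/prov#has_provenance"


def parse_link_header(value):
    """Split a Link header value into (target, params) pairs.

    Simplified: assumes each parameter is written as name="value".
    """
    links = []
    pattern = r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*"[^"]*")*)'
    for match in re.finditer(pattern, value):
        target, raw_params = match.groups()
        params = dict(re.findall(r'([\w-]+)\s*=\s*"([^"]*)"', raw_params))
        links.append((target, params))
    return links


def provenance_links(value):
    """Return (provenance URI, anchor) for every has_provenance link."""
    return [
        (target, params.get("anchor"))
        for target, params in parse_link_header(value)
        # rel may hold several space-separated relation types
        if HAS_PROVENANCE in params.get("rel", "").split()
    ]


header = (
    '<http://example.com/resource-provenance>; '
    'rel="http://www.w3.org/ns/prov#has_provenance"; '
    'anchor="http://example.com/resource"'
)
print(provenance_links(header))
```

For the header above this yields the provenance URI http://example.com/resource-provenance together with the anchor http://example.com/resource, which names the resource the provenance is about.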

Publication
PhD PhD thesis outline, University of Amsterdam
Reproducibility
Tracking workflow execution with TavernaProv

Apache Taverna is a scientific workflow system for combining web services and local tools. Taverna records provenance of workflow runs, intermediate values and user interactions, both as an aid for debugging while designing the workflow and as a record for later reproducibility and comparison.

Taverna also records provenance of the evolution of the workflow definition (including a chain of wasDerivedFrom relations), attributions and annotations; for brevity we here focus on how Taverna’s workflow run provenance extends PROV and is embedded with Research Objects.

FAIR Computational workflows Journal article published in Data Intelligence
Research object
Wf4Ever project Wf4Ever was a research project funded by EU Framework 7 to investigate how scientific workflows and their data could be better preserved for reproducibility, reuse and resilience against workflow decay.
Tracking workflow execution with TavernaProv

Apache Taverna is a scientific workflow system for combining web services and local tools. Taverna records provenance of workflow runs, intermediate values and user interactions, both as an aid for debugging while designing the workflow and as a record for later reproducibility and comparison.

Taverna also records provenance of the evolution of the workflow definition (including a chain of wasDerivedFrom relations), attributions and annotations; for brevity we here focus on how Taverna’s workflow run provenance extends PROV and is embedded with Research Objects.

Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv Journal article published in GigaScience
Ro
Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv Journal article published in GigaScience
Ro crate
BioHackEU23 report: Enabling FAIR Digital Objects with RO-Crate, Signposting and Bioschemas BioHackrXiv preprint from ELIXIR BioHackathon 2023
Enhancing Research Data Management in Galaxy and Data Stewardship Wizard by utilising RO-Crates BioHackrXiv preprint from ELIXIR BioHackathon 2022
Role
Multiple agents sharing roles

Assuming the task of writing provenance for a student group exercise, consider the question:

Do we need to assign everyone in the group a specific role since in our group we found that for many of the tasks, everyone worked together to complete it?

MSc Student in Understanding Data and their Environment, University of Manchester, 2020

This blog post explores the different PROV patterns that could describe this scenario.

Roles
Attribution vs association

A valid question when writing provenance in the responsibility view and process view is: should we also attribute entities to their contributors, or isn’t that what the activities are already showing? In this blog post we explore the different options.

Especially with roles, it may seem unnecessary to also declare wasAttributedTo statements.

It is true that from:

wasAttributedTo(ex:entity, ex:agent)

you can conclude that there was some activity X such that:

wasGeneratedBy(ex:entity, X)  
wasAssociatedWith(X, ex:agent)

This conclusion follows from the constraint on agents and the definition of wasAttributedTo.
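The inference above can be sketched mechanically: each attribution implies one generation and one association through an activity that is otherwise unknown. The Python snippet below is only an illustration. The function name infer_from_attribution and the generated identifiers such as ex:activity1 are hypothetical placeholders standing in for the unknown X, and the output is plain text in the style of the statements above rather than validated PROV-N.

```python
from itertools import count


def infer_from_attribution(attributions):
    """List the statements implied by (entity, agent) attribution pairs."""
    fresh = count(1)
    statements = []
    for entity, agent in attributions:
        # Hypothetical placeholder name for the unknown activity X
        activity = f"ex:activity{next(fresh)}"
        statements.append(f"wasGeneratedBy({entity}, {activity})")
        statements.append(f"wasAssociatedWith({activity}, {agent})")
    return statements


for statement in infer_from_attribution([("ex:entity", "ex:agent")]):
    print(statement)
```

The inference only runs in this direction: it says that some activity existed, not which one, so explicit wasAttributedTo statements still carry information that a reader cannot always reconstruct from the activities alone.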

Scientific workflows
Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv Journal article published in GigaScience
Signposting
BioHackEU23 report: Enabling FAIR Digital Objects with RO-Crate, Signposting and Bioschemas BioHackrXiv preprint from ELIXIR BioHackathon 2023
Locating provenance for a RESTful web service

This blog post shows how RESTful web services can provide, and link to, provenance data for their exposed resources by using the PROV-AQ mechanism of HTTP Link headers. This is demonstrated by showing how to update a hello world REST service implemented with Java and JAX-RS 2.0 to provide these links.

The PROV-AQ HTTP mechanism is most easily explained by an example:

GET http://example.com/resource.html HTTP/1.1
Accept: text/html

HTTP/1.1 200 OK
Content-type: text/html
Link: <http://example.com/resource-provenance>; 
         rel="http://www.w3.org/ns/prov#has_provenance"; 
         anchor="http://example.com/resource"

<html>
  <!-- ... -->
</html>

This request for http://example.com/resource.html returns some HTML, but also provides a Link: header that says that the provenance is located at http://example.com/resource-provenance.

Software
FAIR Computational workflows Journal article published in Data Intelligence
Taverna
Wf4Ever project Wf4Ever was a research project funded by EU Framework 7 to investigate how scientific workflows and their data could be better preserved for reproducibility, reuse and resilience against workflow decay.
Tools
Installing ProvToolbox on macOS

ProvToolbox is a useful command line tool for validating and visualizing PROV documents, but unfortunately it can be a bit of a challenge to install on Windows and on macOS because of its dependency requirements.

This post suggests three step-by-step methods of installing ProvToolbox on your Mac – you should follow the method you feel most comfortable with, but can try the other methods in case of problems.

Table of contents

  1. Overview of requirements
    1. Software packaging for macOS
  2. Conda
    1. Installing Graphviz and OpenJDK with Conda
  3. HomeBrew
    1. Installing Graphviz with HomeBrew
    2. Installing OpenJDK with HomeBrew
  4. Installing manually
    1. Installing AdoptOpenJDK manually
    2. Installing Graphviz manually
  5. Installing ProvToolbox
    1. Using ProvToolbox from VSCode

Overview of requirements

As of 2020-08, ProvToolbox 0.9.5 is the latest release, which requires:

Validating and visualising PROV

This blog post gives a gentle PROV-N introduction and then explores tools for validating and visualising PROV.

One of the advantages of W3C PROV having a common data model is that it can be serialized, or written out, in multiple file formats. The PROV family of W3C specifications describes mappings to PROV-XML and PROV-O (which, being based on OWL2, itself has multiple serializations for Linked Data, including the RDF formats Turtle and JSON-LD).

In addition to these standard approaches we also have PROV-JSON and PROV-JSONLD, which could be well suited for Web applications. All of these can in theory be mapped to each other through the common PROV Data Model and the use of URIs as Linked Data global identifiers.

Locating provenance for a RESTful web service

This blog post shows how RESTful web services can provide, and link to, provenance data for their exposed resources by using the PROV-AQ mechanism of HTTP Link headers. This is demonstrated by showing how to update a hello world REST service implemented with Java and JAX-RS 2.0 to provide these links.

The PROV-AQ HTTP mechanism is most easily explained by an example:

GET http://example.com/resource.html HTTP/1.1
Accept: text/html

HTTP/1.1 200 OK
Content-type: text/html
Link: <http://example.com/resource-provenance>; 
         rel="http://www.w3.org/ns/prov#has_provenance"; 
         anchor="http://example.com/resource"

<html>
  <!-- ... -->
</html>

This request for http://example.com/resource.html returns some HTML, but also provides a Link: header that says that the provenance is located at http://example.com/resource-provenance.

Tutorial
Installing ProvToolbox on macOS

ProvToolbox is a useful command line tool for validating and visualizing PROV documents, but unfortunately it can be a bit of a challenge to install on Windows and on macOS because of its dependency requirements.

This post suggests three step-by-step methods of installing ProvToolbox on your Mac – you should follow the method you feel most comfortable with, but can try the other methods in case of problems.

Table of contents

  1. Overview of requirements
    1. Software packaging for macOS
  2. Conda
    1. Installing Graphviz and OpenJDK with Conda
  3. HomeBrew
    1. Installing Graphviz with HomeBrew
    2. Installing OpenJDK with HomeBrew
  4. Installing manually
    1. Installing AdoptOpenJDK manually
    2. Installing Graphviz manually
  5. Installing ProvToolbox
    1. Using ProvToolbox from VSCode

Overview of requirements

As of 2020-08, ProvToolbox 0.9.5 is the latest release, which requires:

Installing ProvToolbox in Windows

While there are several tools available for validating and visualizing PROV, ProvToolbox is perhaps the most useful for validating PROV-N syntax. However, the normal release does not run on Windows due to an operating system restriction on command line and folder path length.

We have suggested a fix, but while we wait for that, here we describe a patch build that should work on Windows. We also show how to install dependencies: Java for executing ProvToolbox, and Graphviz for visualization. (See also macOS install).

Tracking versions with PAV

The PAV ontology specializes the W3C PROV-O standard to give a lightweight approach to recording details about a resource, giving its Provenance, Authorship and Versioning. Our paper on PAV explores all of these aspects in detail. In this blog post we discuss Versioning as modelled by PAV, including their hierarchical organization.

Versioning is commonly used for software releases (e.g. Windows 8.1, Firefox 26, Python 3.3.2), but increasingly also for datasets and documents. For the purpose of provenance, a version number allows the declaration of the current state of a resource, which can be cross-checked against release notes and used for references, for instance to indicate which particular version of a dataset was used in producing an analysis report.

Tutorials
Locating provenance for a RESTful web service

This blog post shows how RESTful web services can provide, and link to, provenance data for their exposed resources by using the PROV-AQ mechanism of HTTP Link headers. This is demonstrated by showing how to update a hello world REST service implemented with Java and JAX-RS 2.0 to provide these links.

The PROV-AQ HTTP mechanism is most easily explained by an example:

GET http://example.com/resource.html HTTP/1.1
Accept: text/html

HTTP/1.1 200 OK
Content-type: text/html
Link: <http://example.com/resource-provenance>; 
         rel="http://www.w3.org/ns/prov#has_provenance"; 
         anchor="http://example.com/resource"

<html>
  <!-- ... -->
</html>

This request for http://example.com/resource.html returns some HTML, but also provides a Link: header that says that the provenance is located at http://example.com/resource-provenance.

Tutorial on the W3C PROV family of specifications

Posted by Khalid Belhajjame

Provenance, a form of structured metadata designed to record the origin or source of information, can be instrumental in deciding whether information is to be trusted, how it can be integrated with other diverse information sources, and how to establish attribution of information to authors throughout its history.

The PROV set of specifications, produced by the World Wide Web Consortium (W3C), is designed to promote the publication of provenance information on the Web, and offers a basis for interoperability across diverse provenance management systems. The PROV provenance model is deliberately generic and domain-agnostic, but extension mechanisms are available and can be exploited for modelling specific domains.

URI
What are good PROV-N prefixes?

In this blog post we explore the role of PROV-N prefixes and how to decide on a good namespace to use for your own custom provenance terms.

Most examples of PROV-N use example prefixes like:

prefix ex <http://example.com/>
prefix exg <http://example.org/government>

These example domains are explicitly reserved globally for all kinds of examples and training material, and deliberately do not have any content, advertisement or affiliations.

Assume you are writing the provenance of a student group exercise. Should you be using the prefix/namespace ex and example.org to define agents/entities/relationships and your own attribute types?
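It helps to recall what a prefix does: a qualified name such as ex:report1 stands for the URI formed by appending the local part to the namespace of the prefix. The short Python sketch below illustrates that expansion using the two example prefixes above. The function name expand and the local name report1 are made up for illustration.

```python
def expand(qualified_name, prefixes):
    """Expand prefix:local into a URI by plain concatenation."""
    prefix, _, local = qualified_name.partition(":")
    return prefixes[prefix] + local


# Prefix declarations copied from the PROV-N example above
prefixes = {
    "ex": "http://example.com/",
    "exg": "http://example.org/government",
}

print(expand("ex:report1", prefixes))
print(expand("exg:report1", prefixes))
```

Because the expansion is plain concatenation, the second name becomes http://example.org/governmentreport1. A namespace normally ends in / or # so that local names stay separate from the rest of the URI.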

Validation
Validating and visualising PROV

This blog post gives a gentle PROV-N introduction and then explores tools for validating and visualising PROV.

One of the advantages of W3C PROV having a common data model is that it can be serialized, or written out, in multiple file formats. The PROV family of W3C specifications describes mappings to PROV-XML and PROV-O (which, being based on OWL2, itself has multiple serializations for Linked Data, including the RDF formats Turtle and JSON-LD).

In addition to these standard approaches we also have PROV-JSON and PROV-JSONLD, which could be well suited for Web applications. All of these can in theory be mapped to each other through the common PROV Data Model and the use of URIs as Linked Data global identifiers.

Versioning
Tracking versions with PAV

The PAV ontology specializes the W3C PROV-O standard to give a lightweight approach to recording details about a resource, giving its Provenance, Authorship and Versioning. Our paper on PAV explores all of these aspects in detail. In this blog post we discuss Versioning as modelled by PAV, including their hierarchical organization.

Versioning is commonly used for software releases (e.g. Windows 8.1, Firefox 26, Python 3.3.2), but increasingly also for datasets and documents. For the purpose of provenance, a version number allows the declaration of the current state of a resource, which can be cross-checked against release notes and used for references, for instance to indicate which particular version of a dataset was used in producing an analysis report.

Vocabulary
Recording authorship, curation and digital creation with the PAV ontology PAV is a lightweight ontology for tracking Provenance, Authoring and Versioning.  PAV supplies terms for distinguishing between the different roles of the agents contributing content in current web based systems: contributors, authors, curators and digital artifact creators. The ontology also provides terms for tracking provenance of digital entities that are published on the web and then accessed, transformed and consumed.
W3C
PROV released as W3C Recommendations

The Provenance Working Group was chartered to develop a framework for interchanging provenance on the Web. The Working Group has now published the PROV Family of Documents as W3C Recommendations, along with corresponding supporting notes. You can find a complete list of the documents in the PROV Overview Note.
PROV enables one to represent and interchange provenance information using widely available formats such as RDF and XML. In addition, it provides definitions for accessing provenance information, validating it, and mapping to Dublin Core. Learn more about the Semantic Web.

W3C provenance working group
W3C PROV Implementations: Preliminary Analysis

By Khalid Belhajjame, syndicated from https://khalidbelhajjame.wordpress.com/2013/04/04/w3c-prov-implementations/

In the beginning of December 2012, the W3C Provenance Working Group issued a call for implementations. As of February the 25th 2013, 64 PROV implementations were reported to the W3C Provenance Working Group.

These implementations took different forms, ranging from stand-alone applications (30), to reusable frameworks and libraries (10), to services hosted by third parties (9), to vocabularies (21), and constraints validation modules (3).

Wf4ever
Wf4Ever project Wf4Ever was a research project funded by EU Framework 7 to investigate how scientific workflows and their data could be better preserved for reproducibility, reuse and resilience against workflow decay.
Windows
Installing ProvToolbox in Windows

While there are several tools available for validating and visualizing PROV, ProvToolbox is perhaps the most useful for validating PROV-N syntax. However, the normal release does not run on Windows due to an operating system restriction on command line and folder path length.

We have suggested a fix, but while we wait for that, here we describe a patch build that should work on Windows. We also show how to install dependencies: Java for executing ProvToolbox, and Graphviz for visualization. (See also macOS install).

Workflow
Enhancing Research Data Management in Galaxy and Data Stewardship Wizard by utilising RO-Crates BioHackrXiv preprint from ELIXIR BioHackathon 2022
Applying the FAIR Principles to Computational Workflows Journal article published in Scientific Data
FAIR Computational workflows Journal article published in Data Intelligence
WorkflowHub: a registry for computational workflows arXiv preprint
Workflows
Wf4Ever project Wf4Ever was a research project funded by EU Framework 7 to investigate how scientific workflows and their data could be better preserved for reproducibility, reuse and resilience against workflow decay.
Tracking workflow execution with TavernaProv

Apache Taverna is a scientific workflow system for combining web services and local tools. Taverna records provenance of workflow runs, intermediate values and user interactions, both as an aid for debugging while designing the workflow and as a record for later reproducibility and comparison.

Taverna also records provenance of the evolution of the workflow definition (including a chain of wasDerivedFrom relations), attributions and annotations; for brevity we here focus on how Taverna’s workflow run provenance extends PROV and is embedded with Research Objects.

World wide web consortium
W3C PROV Implementations: Preliminary Analysis

By Khalid Belhajjame, syndicated from https://khalidbelhajjame.wordpress.com/2013/04/04/w3c-prov-implementations/

In the beginning of December 2012, the W3C Provenance Working Group issued a call for implementations. As of February the 25th 2013, 64 PROV implementations were reported to the W3C Provenance Working Group.

These implementations took different forms, ranging from stand-alone applications (30), to reusable frameworks and libraries (10), to services hosted by third parties (9), to vocabularies (21), and constraints validation modules (3).