Tutorial on the W3C PROV family of specifications

Posted by Khalid Belhajjame

Provenance, a form of structured metadata designed to record the origin or source of information, can be instrumental in deciding whether information is to be trusted, how it can be integrated with other diverse information sources, and how to establish attribution of information to authors throughout its history.

The PROV set of specifications, produced by the World Wide Web Consortium (W3C), is designed to promote the publication of provenance information on the Web, and offers a basis for interoperability across diverse provenance management systems. The PROV provenance model is deliberately generic and domain-agnostic, but extension mechanisms are available and can be exploited for modelling specific domains.

Paolo Missier, Khalid Belhajjame and James Cheny gave a tutorial at the EDBT conference on 2013-03-20 in Genova, Italy. The tutorial provided an account of these specifications. Starting from intuitive and informal examples that present idiomatic provenance patterns, it progressively introduces the relational model of provenance along with the constraints model for validation of provenance documents, and concludes with example applications that show the extension points in use.

Tutorial material

The tutorial is in three parts, each about 30 minutes long, and consists of the following material:

There is also a short paper describing the motivation, structure and content of the tutorial, published in the EDBT'13 proceedings: The W3C PROV family of specifications for modelling provenance metadata, Paolo Missier, Khalid Belhajjame, and James Cheney [archived 2013-05-13] https://doi.org/10.1145/2452376.2452478