The PROV working group received a question from Mike:
My understanding is that an entity referenced in a PROV bundle (e.g. via wasGeneratedBy) must be in the bundle…but I do not wish to duplicate entity definitions throughout my bundles. My entities are long-lived and will exist in multiple bundles.
So let's say I have a resource for alarms which contains a list of all alarms my company monitors. If I turn off the alarm at alarm/1, my understanding is that in PROV a new entity is created for the new state of alarm/1.
But in my actual data store, I don’t create a new record, I just toggle a flag. So there is a disconnect between how my PROV looks and how my data looks. My understanding is that this is by design.
So I would have a new entity in my prov for the alarm/1 in the new state which is a specialization of alarm/1, yes? Ultimately, I want to display all of the provenance for alarm/1 so I can see its history from creation to invalidation. Am I going about this the wrong way?
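To make the modelling idea in the question concrete, here is a minimal sketch in plain Java, not an official PROV API. It assumes each state of an alarm is recorded as its own entity that is a specialization of the long-lived alarm/1 entity, and that the statements from all bundles have already been merged into one list. The class name, the state identifiers (alarm/1/state/on, alarm/1/state/off) and the timestamps are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AlarmHistory {
    // One provenance statement: `specific` is a specialization of `general`
    // and was generated at `generatedAt` (ISO-8601 UTC timestamp, hypothetical data).
    static final class Specialization {
        final String specific;
        final String general;
        final String generatedAt;

        Specialization(String specific, String general, String generatedAt) {
            this.specific = specific;
            this.general = general;
            this.generatedAt = generatedAt;
        }
    }

    // Returns the identifiers of every state entity that specializes `general`,
    // ordered by generation time. ISO-8601 UTC strings in the same format
    // sort chronologically when compared as text.
    static List<String> historyOf(String general, List<Specialization> statements) {
        List<Specialization> matches = new ArrayList<>();
        for (Specialization s : statements) {
            if (s.general.equals(general)) {
                matches.add(s);
            }
        }
        matches.sort((a, b) -> a.generatedAt.compareTo(b.generatedAt));
        List<String> ids = new ArrayList<>();
        for (Specialization s : matches) {
            ids.add(s.specific);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Statements as they might be collected from several bundles.
        List<Specialization> statements = Arrays.asList(
            new Specialization("alarm/1/state/off", "alarm/1", "2013-03-02T09:30:00Z"),
            new Specialization("alarm/2/state/on", "alarm/2", "2013-03-01T12:00:00Z"),
            new Specialization("alarm/1/state/on", "alarm/1", "2013-03-01T08:00:00Z"));
        System.out.println(historyOf("alarm/1", statements));
        // prints [alarm/1/state/on, alarm/1/state/off]
    }
}
```

Under these assumptions the data store can keep a single record with a toggled flag, while the provenance keeps one entity per state and the display of the history of alarm/1 becomes a lookup over the specialization statements.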
The Provenance Working Group was chartered to develop a framework for interchanging provenance on the Web. The Working Group has now published the PROV Family of Documents as W3C Recommendations, along with corresponding supporting notes. You can find a complete list of the documents in the PROV Overview Note.
PROV enables one to represent and interchange provenance information using widely available formats such as RDF and XML. In addition, it provides definitions for accessing provenance information, validating it, and mapping to Dublin Core.
This blog post shows how RESTful web services can provide, and link to, provenance data for their exposed resources by using the PROV-AQ mechanism of HTTP Link headers. This is demonstrated by showing how to update a hello world REST service implemented with Java and JAX-RS 2.0 to provide these links.
GET http://example.com/resource.html HTTP/1.1
Accept: text/html

HTTP/1.1 200 OK
Content-type: text/html
Link: <http://example.com/resource-provenance>;
  rel="http://www.w3.org/ns/prov#has_provenance";
  anchor="http://example.com/resource"

<html>
<!-- ... -->
</html>
This request for http://example.com/resource.html returns some HTML, but also provides a Link: header that says that the provenance is located at http://example.com/resource-provenance.
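The header in this exchange can be produced and read with a few lines of code. The following is a minimal sketch in plain Java, deliberately not using JAX-RS so that it runs without extra libraries; the class and method names are hypothetical, and the parser is a simplification that handles only a single link value with one quoted rel parameter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ProvLinkHeader {
    // Link relation used in the example response above.
    static final String HAS_PROVENANCE = "http://www.w3.org/ns/prov#has_provenance";

    // Matches the <target-uri> part at the start of a Link header value.
    private static final Pattern TARGET = Pattern.compile("^\\s*<([^>]*)>");

    // Builds the value of a Link header that points from `anchorUri`
    // to the provenance document at `provenanceUri`.
    static String format(String provenanceUri, String anchorUri) {
        return "<" + provenanceUri + ">; rel=\"" + HAS_PROVENANCE + "\"; anchor=\"" + anchorUri + "\"";
    }

    // Returns the provenance URI if `linkValue` is a has_provenance link,
    // otherwise null. Simplified: one link value, one quoted rel.
    static String provenanceUri(String linkValue) {
        Matcher m = TARGET.matcher(linkValue);
        if (!m.find()) {
            return null;
        }
        boolean isProvenanceLink = linkValue.contains("rel=\"" + HAS_PROVENANCE + "\"");
        return isProvenanceLink ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String value = format("http://example.com/resource-provenance", "http://example.com/resource");
        System.out.println("Link: " + value);
        System.out.println(provenanceUri(value));
    }
}
```

A client that receives the response would pass the Link header value to provenanceUri and then issue a second GET against the returned URI to fetch the provenance itself.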
In early December 2012, the W3C Provenance Working Group issued a call for implementations. As of 25 February 2013, 64 PROV implementations had been reported to the Working Group.
These implementations took different forms, ranging from standalone applications (30), to reusable frameworks and libraries (10), to services hosted by third parties (9), to vocabularies (21), and constraint validation modules (3).
PAV is a lightweight ontology for tracking Provenance, Authoring and Versioning. PAV supplies terms for distinguishing between the different roles of the agents contributing content in current web-based systems: contributors, authors, curators and digital artifact creators. The ontology also provides terms for tracking the provenance of digital entities that are published on the web and then accessed, transformed and consumed.
Provenance, a form of structured metadata designed to record the origin or source of information, can be instrumental in deciding whether information is to be trusted, how it can be integrated with other diverse information sources, and how to establish attribution of information to authors throughout its history.
The PROV set of specifications, produced by the World Wide Web Consortium (W3C), is designed to promote the publication of provenance information on the Web, and offers a basis for interoperability across diverse provenance management systems. The PROV provenance model is deliberately generic and domain-agnostic, but extension mechanisms are available and can be exploited for modelling specific domains.
Scholars have made handwritten notes and comments in books and manuscripts for centuries. Today’s blogs and news sites typically invite users to express their opinions on the published content; URLs allow web resources to be shared with accompanying annotations and comments using third-party services like Twitter or Facebook. These contributions have until recently been constrained within specific services, making them second-class citizens of the Web.