
<?xml version="1.0" encoding="UTF-8"?>
<record>
  <title>Data Provenance Architecture Supporting Environmental Monitoring Processes</title>
  <journal>International Journal of Web Applications</journal>
  <author>Daniel Lins da Silva, André Batista, Pedro Luiz Pizzigatti Corrêa</author>
  <volume>8</volume>
  <issue>3</issue>
  <year>2016</year>
  <doi></doi>
  <url></url>
  <abstract>Long-term research and environmental monitoring are essential for improving the management of ecosystems and natural resources. However, to reuse these data in new experiments and decision-making processes, and to integrate them with other long-term initiatives, scientists need more information about how the data were created and how they evolved, about intellectual property rights, and about technical details, in order to evaluate whether the data can be used. Provenance metadata has emerged as a way to evaluate the quality and reliability of data and to audit processes and data versioning, while enabling data reuse and the reproducibility of experiments and analyses. However, most solutions for capturing and managing provenance metadata are tied to specific tools, have restricted scopes, and are difficult to apply in distributed and heterogeneous environments. In this paper, we present an approach for capturing, managing, and publishing the provenance metadata generated in environmental monitoring processes. Our computational architecture comprises three main components: (1) a data model based on PROV-DM and Dublin Core; (2) a repository of RDF graphs; and (3) a Web API that provides services for collecting, storing, and querying provenance metadata. We demonstrate the application of our approach and show its practical usefulness by using this architecture to manage provenance metadata generated during an environmental monitoring simulation. The results show that our approach is effective in collecting and storing provenance metadata and allows querying the entire provenance of datasets and data products, thus enabling reuse, discovery, and visualization of the raw data, processes, and scientists involved in their generation and evolution.</abstract>
</record>
