Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 48 tok/s
Gemini 2.5 Pro 48 tok/s Pro
GPT-5 Medium 26 tok/s Pro
GPT-5 High 19 tok/s Pro
GPT-4o 107 tok/s Pro
Kimi K2 205 tok/s Pro
GPT OSS 120B 473 tok/s Pro
Claude Sonnet 4 37 tok/s Pro
2000 character limit reached

Enhanced Inversion of Schema Evolution with Provenance (2211.13810v1)

Published 24 Nov 2022 in cs.DB

Abstract: Long-term data-driven studies have become indispensable in many areas of science. Often, the data formats, structures and semantics of data change over time, the data sets evolve. Therefore, studies over several decades in particular have to consider changing database schemas. The evolution of these databases lead at some point to a large number of schemas, which have to be stored and managed, costly and time-consuming. However, in the sense of reproducibility of research data each database version must be reconstructable with little effort. So a previously published result can be validated and reproduced at any time. Nevertheless, in many cases, such an evolution can not be fully reconstructed. This article classifies the 15 most frequently used schema modification operators and defines the associated inverses for each operation. For avoiding an information loss, it furthermore defines which additional provenance information have to be stored. We define four classes dealing with dangling tuples, duplicates and provenance-invariant operators. Each class will be presented by one representative. By using and extending the theory of schema mappings and their inverses for queries, data analysis, why-provenance, and schema evolution, we are able to combine data analysis applications with provenance under evolving database structures, in order to enable the reproducibility of scientific results over longer periods of time. While most of the inverses of schema mappings used for analysis or evolution are not exact, but only quasi-inverses, adding provenance information enables us to reconstruct a sub-database of research data that is sufficient to guarantee reproducibility.

Summary

We haven't generated a summary for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Lightbulb On Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.