
A Unifying Perspective on Succinct Data Representations (2309.11663v2)

Published 20 Sep 2023 in cs.DB and cs.FL

Abstract: Factorized representations (FRs) are a well-known tool for succinctly representing the results of join queries and were originally defined using the named database perspective. We define FRs in the unnamed database perspective and use them to establish several new connections. First, unnamed FRs can be exponentially more succinct than named FRs, but this difference can be alleviated by imposing a disjointness condition on columns. Conversely, named FRs can also be exponentially more succinct than unnamed FRs. Second, unnamed FRs are the same as (i.e., isomorphic to) context-free grammars for languages in which each word has the same length. This tight connection allows us to transfer a wide range of results on context-free grammars to database factorization, of which we offer a selection in the paper. Third, when we generalize unnamed FRs to arbitrary sets of tuples, they become a generalization of path multiset representations, a formalism that was recently introduced to succinctly represent sets of paths in the context of graph database query evaluation.
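
As a rough illustration of the idea in the abstract (not the paper's formal definition), an unnamed FR can be sketched as a nested expression built from atoms, unions, and products, where a tuple is identified by column position rather than attribute name. The encoding and the helper names `atom`, `union`, `prod`, `tuples`, and `size` below are assumptions made for this example. Reading a union as a choice between grammar productions and a product as concatenation hints at the grammar connection: every tuple ("word") the expression produces has the same length.

```python
from itertools import product as cartesian

# Hypothetical encoding of an unnamed FR (an assumption for illustration):
#   ("atom", value) | ("union", [children]) | ("product", [children])
def atom(v):
    return ("atom", v)

def union(*children):
    return ("union", list(children))

def prod(*children):
    return ("product", list(children))

def tuples(fr):
    """Expand the expression into the set of tuples it represents."""
    kind, body = fr
    if kind == "atom":
        return {(body,)}
    if kind == "union":
        # A union contributes the tuples of any one child.
        return set().union(*(tuples(c) for c in body))
    # A product concatenates one tuple from each child, in order.
    return {sum(combo, ()) for combo in cartesian(*(tuples(c) for c in body))}

def size(fr):
    """Number of atoms in the expression (a simple size measure)."""
    kind, body = fr
    return 1 if kind == "atom" else sum(size(c) for c in body)

# {a1, a2} x {b1, b2, b3}: five atoms stand for six binary tuples.
fr = prod(union(atom("a1"), atom("a2")),
          union(atom("b1"), atom("b2"), atom("b3")))
rel = tuples(fr)
```

Here the flat relation stores twelve values (six tuples of length two), while the factorized expression stores five atoms. The gap widens as more columns are multiplied together.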

Authors (3)
  1. Benny Kimelfeld (57 papers)
  2. Wim Martens (22 papers)
  3. Matthias Niewerth (10 papers)
Citations (2)
