Hedgelog: Datalog on Monoids
Location
New Computer Science, Engineering Dr, Stony Brook, NY 11794, USA
Event Description
Abstract: Datalog is a powerful language for expressing recursive computations through rules: Horn clauses in first order logic. Although effective at expressing queries over existential properties, Datalog and many of its popular implementations struggle with queries that involve more complex aggregates, requiring users to apply verbose, non-composable, and/or inefficient workarounds. Recent work on lattice-based datalogs addresses many of these concerns for aggregates that can be encoded as lattices (e.g., min or max), but more general aggregates like count remain problematic. In this talk, I will argue that this is not a fundamental limitation of Datalog, but rather from its model of truth: Both datalog semantics and evaluation rules make heavy use of the fact that insertion is both monotone and idempotent. Once a fact is known to be true, it can not be retracted, nor can further discoveries of the same fact alter its truth. Monotonicity is critical for forward progress under Datalog's ``open world'' model, as it allows us to safely assert the truth of a body. Meanwhile, idempotence makes it easier to reason about evaluation, as we need only guarantee that each head atom will be derived at-least-once. Unfortunately, more general aggregates like sum() are neither idempotent, nor monotone. I will introduce Hedgelog, a strict generalization of Datalog that uses general monoids as a basis for truth. I will show that this generalization remains compatible with Datalog's open world model, how it enables cleaner and more composable datalog programs, and how the underlying monoid relations open the door to interesting datastructure-level optimizations.
Bio: Oliver Kennedy is an associate professor at the University at Buffalo. He earned his PhD from Cornell University in 2011 and now leads the Online Data Interactions (ODIn) lab, which operates at the intersection of databases and programming languages. Oliver is the recipient of an NSF CAREER award, an IEEE Region 1 Technological Innovation Award, UB's Exceptional Scholar Award, and several UB SEAS teaching awards. Oliver is also one of the founding board members of Breadcrumb Analytics. Several of Oliver's papers have been invited to Best of compilations from SIGMOD and VLDB. The ODIn lab is currently exploring (i) how we can leverage database techniques like incremental view maintenance to make compilers faster, (ii) how to make it easier for data scientists to track how sources of uncertainty, ambiguity, and/or bias affect analyses, and (iii) how to streamline the interfaces --- both human and software --- between different tools for data science, like python, sql, and spreadsheets.
Location: NCS 120
Bio: Oliver Kennedy is an associate professor at the University at Buffalo. He earned his PhD from Cornell University in 2011 and now leads the Online Data Interactions (ODIn) lab, which operates at the intersection of databases and programming languages. Oliver is the recipient of an NSF CAREER award, an IEEE Region 1 Technological Innovation Award, UB's Exceptional Scholar Award, and several UB SEAS teaching awards. Oliver is also one of the founding board members of Breadcrumb Analytics. Several of Oliver's papers have been invited to Best of compilations from SIGMOD and VLDB. The ODIn lab is currently exploring (i) how we can leverage database techniques like incremental view maintenance to make compilers faster, (ii) how to make it easier for data scientists to track how sources of uncertainty, ambiguity, and/or bias affect analyses, and (iii) how to streamline the interfaces --- both human and software --- between different tools for data science, like python, sql, and spreadsheets.
Location: NCS 120