Full Professor, TU Darmstadt
Databases Unleashed - Rethinking Relational Databases in the Age of LLMs
Abstract: Relational databases have been a cornerstone of data management since the 1970s and are used today in virtually every field. One of Edgar Codd's original motivations for the relational model was its simplicity, providing a user-friendly data format and a declarative query language (SQL) that hides low-level complexities from users. However, despite their continued success, I argue that the database systems implementing the relational model have, in many ways, failed to deliver on this original promise. In particular, with the advent of foundational AI models and LLMs capable of querying various data formats, including text and images, without first transforming them into rigid tabular formats, the traditional relational paradigm seems increasingly outdated. This naturally raises the question of whether we will still need relational databases, or whether LLMs will be the "better databases" of the future. In my talk, I outline my vision of future databases and propose a design that combines the best of both worlds. Internally, a relational core ensures scalable, efficient, and explainable query execution (which foundational AI models do not provide by any means), while a surrounding LLM shim lets users directly query arbitrary data and enables rich query capabilities that go beyond what SQL can do today.
Bio: Carsten Binnig is a Full Professor in the Computer Science Department at TU Darmstadt and was recently a Visiting Researcher with the Google Systems Research Group. He earned his Ph.D. from the University of Heidelberg in 2008, after which he worked as a postdoctoral researcher in the Systems Group at ETH Zurich and SAP, focusing on in-memory databases. His current research explores the design of scalable data systems for modern hardware and machine learning for scalable data systems. For his work, he has also received numerous prestigious awards, including Best Paper and Best Demo awards at top conferences such as SIGMOD, VLDB, and CIDR, as well as a LOEWE top professorship from the state of Hesse.
Associate Professor, UC Berkeley
LLM-powered Data Tooling: the Next Frontier
Abstract: LLMs are changing the world, but how can they help with data processing? In this talk, we discuss ongoing work in the EPIC Data Lab at Berkeley to rethink the end-to-end data lifecycle, now with LLMs in the mix. We describe our scalable, efficient, and usable text data processing system stack, aka our document "stack", as well as a couple of projects that are having impact across a number of real-world domains.
Bio: Aditya Parameswaran is an Associate Professor in Computer Science at UC Berkeley, and a co-director of the EPIC Data Lab. Aditya has published 100+ papers in top-tier venues in data management, human-computer interaction, and visualization, with multiple best paper awards. Multiple open-source tools developed in his group (including Modin, Lux, IPyFlow, and DocETL) have received thousands of GitHub stars and have been downloaded tens of millions of times overall across a spectrum of industries. His research was commercialized as a startup, Ponder, in 2021, where he served as Co-founder and President, before its acquisition by Snowflake. Aditya has received the Alfred P. Sloan Research Fellowship, the VLDB Early Career Award, the NSF CAREER Award, and the TCDE Rising Star Award, along with other recognitions. His website is at http://adityagp.net.