Learning the Index vs. Indexing the Learned Models? A Tutorial on Learned Multidimensional Indexes

In collaboration with the UMD Center for Machine Learning

Date and Time of the talk: April 1, 2021, 9:30 AM EDT

About the Speakers

Walid G. Aref, Purdue University and Alexandria University-Egypt

Professor Walid G. Aref received his Ph.D. from the University of Maryland in 1993. His research interests are in extending the functionality of database systems in support of emerging applications, e.g., spatial, spatio-temporal, graph, biological, and sensor databases. He is also interested in query processing, indexing, data streaming, and geographic information systems (GIS). His research has been supported by the National Science Foundation, the National Institutes of Health, Purdue Research Foundation, CERIAS, Panasonic, and Microsoft Corp. In 2001, he received the CAREER Award from the National Science Foundation, and in 2004, he received a Purdue University Faculty Scholar award. Walid is a member of Purdue’s CERIAS. He is the Editor-in-Chief of the ACM Transactions on Spatial Algorithms and Systems (ACM TSAS), an editorial board member of the Journal of Spatial Information Science (JOSIS), and has served as an editor of the VLDB Journal and the ACM Transactions on Database Systems (ACM TODS). He has won several best paper awards, including the 2016 VLDB ten-year best paper award. He is a Fellow of the IEEE and a member of the ACM. Between 2011 and 2014, he served as the chair of the ACM Special Interest Group on Spatial Information (SIGSPATIAL).

Abdullah Al Mamun, Purdue University

Abdullah Al Mamun is a doctoral student in the Department of Computer Science at Purdue University. His research interests are in learned index structures, particularly learned multidimensional and spatial indexes. Previously, he completed his MS in Computer Science at Memorial University of Newfoundland, Canada.

Abstract

Recently, machine learning has been successfully applied to database indexing. Initial experimentation on learned indexes has demonstrated better search performance and lower space requirements than their traditional database counterparts. Numerous attempts have been made to extend learned indexes to the multidimensional space, which makes them potentially suitable for spatial databases. The goal of the tutorial is to provide up-to-date coverage of learned indexes in both the single-dimensional and multidimensional spaces, with an emphasis on the latter. The tutorial navigates the space of learned indexes through a taxonomy that distinguishes between learning the index and indexing the learned models. The taxonomy further classifies learned indexes into static and dynamic ones, based on whether they support updates to the underlying data after the indexed data has been learned. Toward the end, the tutorial highlights some research challenges and potential directions for future research.