This vision paper presents a new generation of multimodal streaming systems that embed Multimodal Large Language Models (MLLMs) as first-class operators, enabling real-time query processing across multiple modalities. While recent work has integrated …
Learned cost models (LCMs) have recently gained traction as a promising alternative to traditional cost estimation techniques in data management, offering improved accuracy by capturing complex interactions between queries, data, and runtime …
This paper presents the first approach to opening the black box by bringing AI explainability to Learned Cost Models (LCMs). New explanation techniques are proposed that extend and adapt existing methods for the general …
Query optimizers traditionally rely on cost models to choose the best execution plan, and while machine learning-based cost models have been proposed to overcome weaknesses of traditional models, limited efforts have been made to investigate how well …
COSTREAM provides a learned cost model for Distributed Stream Processing Systems that can accurately predict the execution costs of a streaming query in an edge-cloud environment. The model can be used to find an initial placement of operators across …
ZeroTune introduces a novel cost model for parallel and distributed stream processing that can be used to effectively set initial parallelism degrees of streaming queries. Unlike existing models, which rely largely on online learning statistics that …
This paper presents zero-shot cost models for parallel stream processing, enabling accurate cost predictions for parallel streaming queries without having observed any query deployment. The approach leverages data-efficient zero-shot learning …
This tutorial workshop at BTW 2023 explores the intersection of machine learning and systems, covering both the application of ML techniques to optimize and improve systems (ML for Systems) as well as systems designed to support and accelerate ML …
Deep learning (DL) inference has become an essential building block in modern intelligent applications. Due to the high computational intensity of DL, it is critical to scale DL inference serving systems in response to fluctuating workloads to …
Distributed Stream Processing (DSP) systems rely heavily on parallelism mechanisms to deliver high performance in terms of latency and throughput. Yet developing such parallel systems comes with numerous challenges. In this paper, …