"stream processing"

Towards a Multimodal Stream Processing System

This vision paper presents a new generation of multimodal streaming systems that embed Multimodal Large Language Models (MLLMs) as first-class operators, enabling real-time query processing across multiple modalities. While recent work has integrated …

Learned Cost Models for Query Optimization: From Batch to Streaming Systems

Learned cost models (LCMs) have recently gained traction as a promising alternative to traditional cost estimation techniques in data management, offering improved accuracy by capturing complex interactions between queries, data, and runtime …

Dema: Efficient Decentralized Aggregation for Non-Decomposable Quantile Functions

The growing number of IoT devices has led to decentralized networks for handling unbounded data streams, but traditional centralized window aggregation results in high network overhead and processing bottlenecks. Current decentralized solutions only …

PDSP-Bench: A Benchmarking System for Parallel and Distributed Stream Processing

PDSP-Bench is a novel benchmarking system designed for a systematic understanding of the performance of parallel stream processing in a distributed environment. While existing benchmarking systems focus on analyzing stream processing systems using …

COSTREAM: Learned Cost Models for Operator Placement in Edge-Cloud Environments

COSTREAM provides a learned cost model for Distributed Stream Processing Systems that can accurately predict the execution costs of a streaming query in an edge-cloud environment. The model can be used to find an initial placement of operators across …

ZERoTuNE: Learned Zero-Shot Cost Models for Parallelism Tuning in Stream Processing

ZERoTuNE introduces a novel cost model for parallel and distributed stream processing that can be used to effectively set initial parallelism degrees of streaming queries. Unlike existing models, which rely mainly on online learning statistics that …

No One Size (PPM) Fits All: Towards Privacy in Stream Processing Systems

Stream processing systems designed to process data streams in real time must handle sensitive or personal data across multilayered systems (sensor, fog, and cloud layers), which raises privacy concerns as data may be subject to unauthorized access …

Zero-Shot Cost Models for Parallel Stream Processing

This paper presents zero-shot cost models for parallel stream processing, enabling accurate cost predictions for parallel streaming queries without having observed any query deployment. The approach leverages data-efficient zero-shot learning …

PANDA: Performance Prediction for Parallel and Dynamic Stream Processing

Distributed Stream Processing (DSP) systems rely heavily on parallelism mechanisms to deliver high performance in terms of latency and throughput. Yet developing such parallel systems comes with numerous challenges. In this paper, …

Zero-shot cost models for Distributed Stream Processing

This paper proposes a learned cost estimation model for Distributed Stream Processing Systems (DSPS) that aims to provide accurate cost predictions for executing queries. A major premise of this work is that the proposed learned model can generalize …