Publications
Deep Learning for Relational Databases
Authors: Jakub Peleška
Diploma Thesis • 2024
Relational databases store the majority of the world's data, yet they remain greatly underutilized in deep learning. This thesis explores the integration of deep learning with relational databases, leveraging the intricate interconnections of the stored values. Recent advancements in AI, particularly deep learning models such as Transformers and CNNs, have revolutionized the fields of natural language processing and computer vision through their ability to process homogeneous data. Relational database data, however, are inherently heterogeneous and structured, posing challenges for traditional deep learning approaches. This research addresses the obstacle of data representation by viewing relational databases as heterogeneous tabular graphs, aligning with recent successes in graph neural networks. The proposed blueprint lays down a foundation for deep learning on relational databases. The neural architecture space of the blueprint allows for the employment of existing tabular models and, importantly, sequence-processing Transformers. The presented Database Transformer highlights the strength of this framework, displaying promising results that outperform existing state-of-the-art methods.
Tabular Transformers Meet Relational Databases
Authors: Jakub Peleška, Gustav Šír
ACM Trans. Intell. Syst. Technol. • 2025
Transformer models have continuously expanded into all machine learning domains convertible to the underlying sequence-to-sequence representation, including tabular data. However, while ubiquitous, this representation restricts their extension to the more general case of relational databases. In this paper, we introduce a modular neural message-passing scheme that closely adheres to the formal relational model, enabling direct end-to-end learning of tabular Transformers from database storage systems. We address the challenges of appropriate learning data representation and loading, which are critical in the database setting, and compare our approach against a number of representative models from various related fields across a notably wide range of datasets. Our results demonstrate the superior performance of this newly proposed class of neural architectures.
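To make the core idea concrete, the sketch below illustrates message passing along foreign-key links when a relational database is viewed as a graph. This is a minimal toy illustration, not the paper's architecture: the table names, the foreign-key layout, and the mean-aggregation update are all assumptions made for the example.

```python
# Toy message passing over a two-table "database" linked by a foreign key:
# each customer embedding is updated with the mean of its orders' features.
from collections import defaultdict

customers = {1: [0.2, 0.4], 2: [0.9, 0.1]}           # customer_id -> features
orders = {10: (1, [1.0, 0.0]), 11: (1, [0.0, 1.0]),  # order_id -> (customer_id, features)
          12: (2, [0.5, 0.5])}

def message_pass(customers, orders):
    """One round of mean-aggregation message passing along the foreign key."""
    incoming = defaultdict(list)
    for _, (cust_id, feats) in orders.items():
        incoming[cust_id].append(feats)
    updated = {}
    for cust_id, feats in customers.items():
        msgs = incoming.get(cust_id, [])
        if msgs:
            mean = [sum(col) / len(msgs) for col in zip(*msgs)]
            # Residual-style combine of own features with aggregated messages.
            updated[cust_id] = [0.5 * a + 0.5 * b for a, b in zip(feats, mean)]
        else:
            updated[cust_id] = feats
    return updated

print(message_pass(customers, orders))
```

In the actual setting, the hand-written mean aggregation would be replaced by learned (e.g. attention-based) aggregation and the update applied across all tables of the schema.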
Assessing Explainability Methods for AI Safety Governance
Authors: Martin Krutský, Jiří Němeček, Paula Gürtler, Jakub Peleška, Gustav Šír
International Conference on Large-Scale AI Risks • 2025
AI safety has become a matter not only of academic but also of public interest, culminating in a call for an AI moratorium in March 2023. Governments around the world have since taken action to promote the safe development of AI. For example, the UK, the US, and Japan have founded national AI Safety Institutes (AISIs), and other countries have followed since. AISIs are tasked with the safety evaluation of advanced AI systems, contributing to standards with technical expertise, and strengthening international cooperation. The EU AI Office has largely similar roles to AISIs, with the additional mandate to support the implementation and enforcement of the AI Act. The success of governance initiatives for AI safety depends on three components: i) specification of technical and legal means by which governance initiatives, such as the AI Act, shall be implemented; ii) stringent legal enforcement of regulation; iii) meaningful human oversight. However, safety criteria are impossible to fully specify in technical terms and to enforce at scale for current AI models that are deployed in dynamically changing, complex environments. We thus propose approaching i) standard specification, ii) auditing, and iii) continuous human oversight with explainable AI (XAI) methods that allow stakeholders to react flexibly as new safety concerns arise. Arguing that not all existing XAI methods are equally beneficial for AI safety, we review their usage in safety-related case studies based on AI Act classification and propose a 5-dimensional framework for their assessment. The first two dimensions are understandability to subjects and auditors. While (non-expert) subjects require simple and succinct justifications, auditors can be expected to process more involved explanations utilizing, e.g., their understanding of statistics. The third dimension is veracity, which measures how accurately an XAI method represents the model's behavior.
The penultimate dimension is actionability, which evaluates the helpfulness of an explanation in resolving possible undesired behaviors of an AI model. Finally, there is scalability, ensuring that even the largest state-of-the-art models can be explained with an XAI method. We suggest that a joint evaluation of the presented (possibly interdependent) dimensions is an essential part of a holistic approach to AI governance, bridging the gap between technical development and regulation.
REDELEX: A Framework for Relational Deep Learning Exploration
Authors: Jakub Peleška, Gustav Šír
ECML PKDD 2025 • 2025
Relational databases (RDBs) are widely regarded as the gold standard for storing structured information. Consequently, predictive tasks leveraging this data format hold significant application promise. Recently, Relational Deep Learning (RDL) has emerged as a novel paradigm wherein RDBs are conceptualized as graph structures, enabling the application of various graph neural architectures to effectively address these tasks. However, given its novelty, there is a lack of analysis of the relationships between the performance of various RDL models and the characteristics of the underlying RDBs. In this study, we present REDELEX—a comprehensive exploration framework for evaluating RDL models of varying complexity on the most diverse collection of over 70 RDBs, which we make available to the community. Benchmarked alongside key representatives of classic methods, we confirm the generally superior performance of RDL while providing insights into the main factors shaping performance, including model complexity, database sizes and their structural properties.
Task-Agnostic Contrastive Pretraining for Relational Deep Learning
Authors: Jakub Peleška, Gustav Šír
MLG 2025 • 2025
Relational Deep Learning (RDL) is an emerging paradigm that leverages Graph Neural Network principles to learn directly from relational databases by representing them as heterogeneous graphs. However, existing RDL models typically rely on task-specific supervised learning, requiring training separate models for each predictive task, which may hamper scalability and reuse. In this work, we propose a novel task-agnostic contrastive pretraining approach for RDL that enables database-wide representation learning. For that aim, we introduce three levels of contrastive objectives (row-level, link-level, and context-level) designed to capture the structural and semantic heterogeneity inherent to relational data. We implement the respective pretraining approach through a modular RDL architecture and an efficient sampling strategy tailored to the heterogeneous database setting. Our preliminary results on standard RDL benchmarks demonstrate that fine-tuning the pretrained models measurably outperforms training from scratch, validating the promise of the proposed methodology in learning transferable representations for relational data.
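As a generic illustration of the contrastive principle underlying such pretraining, the sketch below implements a standard InfoNCE-style loss: an anchor embedding is pulled toward its positive and pushed from negatives. This is a hedged sketch of the general technique, not the paper's specific row-, link-, or context-level objectives; the temperature value and toy embeddings are assumptions.

```python
# Generic InfoNCE contrastive loss on cosine similarities.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def info_nce(anchor, positive, negatives, temperature=0.1):
    """-log softmax probability of the positive among positive + negatives."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    logits = [s / temperature for s in sims]
    m = max(logits)  # subtract max for numerical stability
    denom = sum(math.exp(l - m) for l in logits)
    return -(logits[0] - m - math.log(denom))

anchor = [1.0, 0.0]
loss_easy = info_nce(anchor, [0.9, 0.1], [[-1.0, 0.0]])  # positive aligned with anchor
loss_hard = info_nce(anchor, [-1.0, 0.0], [[0.9, 0.1]])  # positive misaligned
print(loss_easy, loss_hard)
```

A well-aligned positive yields a lower loss than a misaligned one, which is the signal that lets pretraining shape embeddings without task labels.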
XAI Desiderata for Trustworthy AI: Insights from the AI Act
Authors: Martin Krutský, Jiří Němeček, Jakub Peleška, Paula Gürtler, Gustav Šír
TRUST AI 2025 • 2025
Explainable AI (XAI) is an actively growing field. When choosing a suitable XAI method, one can get overwhelmed by the number of existing approaches, their properties, and taxonomies. In this paper, we approach the problem of navigating the XAI landscape from the practical perspective of emerging regulatory needs. In particular, the recently approved AI Act gives users of AI applications classified as “high-risk” the right to explanation. We propose a practical framework to navigate between these high-risk domains and the diverse perspectives of different explainees' roles via six core XAI desiderata. The introduced desiderata can then be used by stakeholders with different backgrounds to make informed decisions about which explainability technique is most appropriate for their use case. By supporting context-sensitive assessment of explanation techniques, our framework contributes to the development of more trustworthy AI systems.