ICAART 2025 Abstracts


Area 1 - Artificial Intelligence

Full Papers
Paper Nr: 17
Title:

Enhancing Appearance-Based Gaze Estimation Through Attention-Based Convolutional Neural Networks

Authors:

Rawdha Karmi, Ines Rahmany and Nawres Khlifa

Abstract: Appearance-based gaze estimation is crucial for applications like assistive technology and human-computer interaction, but high accuracy is challenging due to complex gaze patterns and individual appearance variations. This paper proposes an Attention-Enhanced Convolutional Neural Network (AE-CNN) to address these challenges. By integrating attention submodules, AE-CNN improves feature extraction by focusing on the most relevant regions of input data. We evaluate AE-CNN using the ColumbiaGaze dataset and show that it surpasses previous methods, achieving a remarkable accuracy of 99.98%. This work advances gaze estimation by leveraging attention mechanisms to improve performance.

Paper Nr: 27
Title:

Socially-Guided Machine Learning for Self-Organised Community Empowerment

Authors:

Asimina Mertzani and Jeremy Pitt

Abstract: Two key features of self-organising socio-technical systems are, firstly, the interaction of humans with AI, and secondly, the collective determination of social arrangements. However, this presents the risk of an inequitable distribution of power: either by translating or reinforcing existing power asymmetries directly into digital systems, unintended concession of power by human to computational, or arrogation of power by using AI as a proxy. In this paper, based on a definition of empowerment, we implement a socially-guided machine-learning system which integrates a multi-agent system, generative AI and user-centred visualisation. The system is evaluated through proof-of-concept demonstrations showing how it could assist users in understanding the impact of social arrangements and so empower communities with choice, control and innovation. The significance of this work is to show how, through the synergy of human expertise, generative AI and (multi-)agent-based simulations, it might be possible to enhance human creativity to imagine original social arrangements, visualise their impact on community empowerment, and maintain an equitable distribution of power.

Paper Nr: 37
Title:

Robust Autocorrelation for Period Detection in Time Series

Authors:

Zhi Yang, Likun Hou and Xing Zhao

Abstract: Autocorrelation is a key tool in time series period detection, but its sensitivity to outliers is a significant limitation. This paper introduces a robust autocorrelation method for period detection that minimizes the influence of outliers. By incorporating a moving average and applying a Median Absolute Deviation (MAD) filter to each cycle-subseries, we significantly enhance the robustness of the autocorrelation to outliers. The MAD filter identifies and corrects outliers in the cycle-subseries, based on the assumption that the cycle-subseries consists of a constant plus Gaussian noise. This innovative robust autocorrelation can effectively replace traditional autocorrelation in existing period detection algorithms. Additionally, we propose a new algorithm that leverages our robust autocorrelation. Both theoretical analysis and empirical tests on real-world and synthetic datasets indicate that period detection algorithms using our proposed robust autocorrelation outperform those using traditional autocorrelation. Furthermore, our proposed algorithm surpasses all other existing algorithms in comparison.
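Illustrative sketch (not taken from the paper): the idea of filtering each cycle-subseries with a Median Absolute Deviation (MAD) rule before computing the autocorrelation could look roughly like the following. The function names, the 3-MAD threshold and the choice to replace outliers with the subseries median are assumptions made for illustration, and the moving-average step mentioned in the abstract is omitted.

```python
from statistics import median

def mad_filter(values, threshold=3.0):
    """Replace points far from the median (in scaled-MAD units) with the median.

    Assumes the values are roughly a constant plus Gaussian noise, as the
    abstract states for a cycle-subseries. The threshold is an assumption.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = 1.4826 * mad  # makes MAD comparable to a Gaussian standard deviation
    # If scale is 0 (more than half the points are identical), any deviating
    # point exceeds the limit and is replaced.
    return [med if abs(v - med) > threshold * scale else v for v in values]

def clean_by_cycle_subseries(x, period, threshold=3.0):
    """Apply the MAD filter to every cycle-subseries x[k], x[k+period], ..."""
    x = list(x)
    cleaned = list(x)
    for k in range(period):
        cleaned[k::period] = mad_filter(x[k::period], threshold)
    return cleaned

def autocorrelation(x, lag):
    """Plain sample autocorrelation at a single lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    if var == 0:
        return 0.0
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var

def robust_autocorrelation(x, lag, threshold=3.0):
    """Autocorrelation at `lag` after cleaning the cycle-subseries of that lag."""
    return autocorrelation(clean_by_cycle_subseries(x, lag, threshold), lag)

# A period-4 series with one large outlier injected at index 5.
series = [0, 1, 2, 3] * 6
series[5] = 100
plain = autocorrelation(series, 4)
robust = robust_autocorrelation(series, 4)
```

In this toy series the single outlier dominates the plain estimate at the true period, while the filtered version recovers a clearly positive value.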

Paper Nr: 43
Title:

BL-MVC: Blockchain Enabled Majority Voting Classifier for Predicting Heart Diseases

Authors:

Deepa Kumari, Akshat Kumar K., Ashutosh Wagh, S. Shashank, Abhishek Patidar and Subhrakanta Panda

Abstract: This paper introduces an innovative framework merging Blockchain and a Majority Voting Classifier (MVC) for heart disease detection, aiming to enhance security and accuracy in managing Electronic Health Records (EHR). The proposed system leverages Blockchain’s distributed ledger and smart contract capabilities to create a secure, tamper-resistant repository for heart-related patient data. The architecture comprises a user-friendly React-based front-end and a FastAPI-powered back-end, interfacing with a local blockchain such as Ganache. Solidity smart contracts ensure transparent and secure storage of patient responses, which the framework analyzes through various machine learning models, including hyper-tuned LR, MLP, AdaBoost, CatBoost, and XGBoost. The proposed approach ensembles the predictions using MVC and achieves diagnostic accuracy of up to 90%. This paper also compares the machine learning models’ performance using evaluation metrics such as accuracy, sensitivity, specificity, precision, F1-measure, Matthews correlation coefficient (MCC), and ROC curve. This integrated framework can empower physicians to diagnose heart disease patients accurately while safeguarding sensitive health data.
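Illustrative sketch (not taken from the paper): hard majority voting over the outputs of several already-trained classifiers can be expressed in a few lines. The function name and the toy predictions are hypothetical; the abstract does not say whether the MVC uses hard or weighted voting, so plain hard voting is assumed here.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model predictions by hard majority voting.

    `predictions` is a list with one entry per model; each entry is that
    model's list of predicted labels, all of equal length.
    With an odd number of binary classifiers no ties can occur; otherwise a
    tie goes to the label that appears first among the models.
    """
    voted = []
    for labels in zip(*predictions):  # labels of all models for one sample
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted

# Hypothetical outputs of five models (1 = disease, 0 = no disease)
# for three patients.
model_outputs = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
]
ensemble = majority_vote(model_outputs)
```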

Paper Nr: 46
Title:

Cross-Domain Generalization with Reverse Dynamics Models in Offline Model-Based Reinforcement Learning

Authors:

Yana Stoyanova and Maryam Tavakol

Abstract: Recent advancements in offline reinforcement learning (RL) have enabled automation in many real-world applications, where online interactions are often infeasible or costly, especially in high-stakes problems like healthcare or robotics. However, most algorithms are developed and evaluated in the same environment, which does not reflect the ever-changing nature of our world. Hence, beyond dealing with the distributional shift between the learning policy and offline data, it is crucial to account for domain shifts. Model-based offline RL (MBORL) methods are generally preferred over model-free counterparts for their ability to generalize beyond the dataset by learning (forward) dynamics models to generate new trajectories. Nevertheless, these models tend to overgeneralize in out-of-support regions due to limited samples. In this paper, we present a safer approach to balance conservatism and generalization by learning a reverse dynamics model instead, that can adapt to environments with varying dynamics, known as cross-domain generalization. We introduce CARI (Context-Aware Reverse Imaginations), a novel approach that incorporates context-awareness to capture domain-specific characteristics into the reverse dynamics model, resulting in more accurate models. Experiments on four variants of Hopper and Walker2D demonstrate that CARI consistently matches or outperforms state-of-the-art MBORL techniques that utilize a reverse dynamics model for cross-domain generalization.

Paper Nr: 51
Title:

Enhancing LULC Classification with Attention-Based Fusion of Handcrafted and Deep Features

Authors:

Vian Abdulmajeed Ahmed, Khaled Jouini and Ouajdi Korbaa

Abstract: Satellite imagery provides a unique perspective of the Earth’s surface, pivotal for applications like environmental monitoring and urban planning. Despite significant advancements, analyzing satellite imagery remains challenging due to complex and variable land cover patterns. Traditional handcrafted descriptors like Scale-Invariant Feature Transform (SIFT) excel at capturing local features but often fail to capture the global context. Conversely, Convolutional Neural Networks (CNNs) excel at capturing rich contextual information but may miss crucial local features due to limitations in capturing small and subtle spatial arrangements. Most existing Land Use and Land Cover (LULC) classification approaches heavily rely on fine-tuning large pretrained models. While this remains a powerful tool, this paper explores alternative strategies by leveraging the complementary strengths of handcrafted and CNN-learned features. Specifically, we investigate and compare three fusion strategies: (i) early fusion, where handcrafted and CNN-learned features are merged at the input level; (ii) late fusion, where attention mechanisms dynamically integrate salient features from both CNN and SIFT modalities; and (iii) mid-level fusion, where attention is used to generate two feature maps: one prioritizing global context and another, weighted by SIFT features, emphasizing local details. Experiments on the real-world EuroSAT dataset demonstrate that these fusion approaches exhibit varying levels of effectiveness and that a well-chosen fusion strategy not only substantially outperforms the underlying methods used separately but also offers an interesting alternative to solely relying on fine-tuning pre-trained large models.

Paper Nr: 66
Title:

Using Machine Learning to Distinguish Human-Written from Machine-Generated Creative Fiction

Authors:

Andrea Cristina McGlinchey and Peter J. Barclay

Abstract: Following the universal availability of generative AI systems with the release of ChatGPT, automatic detection of deceptive text created by Large Language Models has focused on domains such as academic plagiarism and “fake news”. However, generative AI also poses a threat to the livelihood of creative writers, and perhaps to literary culture in general, through reduction in quality of published material. Training a Large Language Model on writers’ output to generate “sham books” in a particular style seems to constitute a new form of plagiarism. This problem has been little researched. In this study, we trained Machine Learning classifier models to distinguish short samples of human-written from machine-generated creative fiction, focusing on classic detective novels. Our results show that a Naïve Bayes and a Multi-Layer Perceptron classifier achieved a high degree of success (accuracy > 95%), significantly outperforming human judges (accuracy < 55%). This approach worked well with short text samples (around 100 words), which previous research has shown to be difficult to classify. We have deployed an online proof-of-concept classifier tool, AI Detective, as a first step towards developing lightweight and reliable applications for use by editors and publishers, with the aim of protecting the economic and cultural contribution of human authors.

Paper Nr: 67
Title:

An A-Star Algorithm for Argumentative Rule Extraction

Authors:

Benoît Alcaraz, Adam Kaliski and Christopher Leturc

Abstract: In this paper, we present an approach for inferring logical rules in the form of formal argumentation frameworks using the A* algorithm. We show that contextual argumentation frameworks, in which arguments are activated and deactivated based on the values of the boolean variables that the arguments represent, allow for a concise, graphical, and hence explainable representation of logical rules. We define the proposed approach as a tool to understand the behaviour of already deployed black-box agents. Additionally, we show several applications where having an argumentation framework representing an agent’s decision model is required or could be beneficial. We then apply our algorithm to several datasets in order to evaluate its performance. The algorithm reaches high accuracy scores on discrete datasets, indicating that our approach could be a promising avenue for alternative data-driven AI learning techniques, especially in the context of explainable AI.

Paper Nr: 69
Title:

Fruit-HSNet: A Machine Learning Approach for Hyperspectral Image-Based Fruit Ripeness Prediction

Authors:

Ahmed Baha Ben Jmaa, Faten Chaieb and Anna Fabijańska

Abstract: Fruit ripeness prediction (FRP) is a classification-based agricultural computer vision task that has attracted much attention, thanks to its wide-ranging advantages in the agricultural field for both pre-harvest and post-harvest management. Accurate and timely FRP can be achieved using machine/deep learning-based hyperspectral image classification techniques. However, challenges including the limited availability of labeled data and the lack of robust methods generalizable to various hyperspectral cameras and fruit types can compromise the effectiveness of hyperspectral image-based FRP. Addressing these challenges, this paper introduces Fruit-HSNet, a machine learning architecture specifically designed for hyperspectral classification of fruit ripeness. Fruit-HSNet incorporates a spatio-spectral feature extraction module based on Fourier Transform and central pixel spectral signature, followed by learnable feature fusion and a classifier optimized for ripeness classification. The proposed architecture was evaluated using the DeepHS Fruit dataset, the largest publicly available labeled real-world hyperspectral dataset for predicting fruit ripeness, which includes five different types of fruits (avocado, kiwi, mango, kaki, and papaya) captured with three distinct hyperspectral cameras at various stages of ripeness. Experimental results highlight that Fruit-HSNet substantially outperforms existing deep learning methods, from baseline to state-of-the-art models, with improvements of 12%, achieving a new state-of-the-art overall accuracy of 70.73%.

Paper Nr: 70
Title:

Leveraging Embedding Vectors of Aggregate Images for Particle Size Distribution Estimation and Concrete Compressive Strength Prediction

Authors:

Samuel Fringeli, Houda Chabbi Drissi, Killian Ruffieux, Julien Ston and Daia Zwicky

Abstract: Accurate prediction of concrete properties, such as compressive strength, is essential for ensuring structural performance. Particle size distribution (PSD) and nature of aggregates are key components of concrete mixtures, significantly influencing their final compressive strength. This paper presents a novel approach that leverages embedding vectors extracted from images of aggregates using the DinoV2 model to efficiently predict compressive strength. DinoV2 is a state-of-the-art vision transformer that excels at generating high-quality embeddings for various visual tasks. In this study, the effectiveness of these embeddings is evaluated by using them to classify and estimate the PSD of aggregates on public datasets. Small neural models trained on these vectors achieved comparable accuracy to the best found fine-tuned ViT-16 model, demonstrating the potential of using embedding vectors for accurate PSD prediction. Building on these results, a new approach for predicting concrete compressive strength by combining embedding vectors with data on concrete mix components is explored. A small dataset of concrete mixtures was created. To mitigate the challenges of limited data, augmentation techniques were proposed to generate additional, realistic mix designs. An ablation study was performed, indicating promising results and highlighting the potential of this new approach for predicting other concrete properties.

Paper Nr: 73
Title:

An Efficient Method for Assessing the Strength of Mahjong Programs

Authors:

Shih-Chieh Tang, Jr-Chang Chen and I-Chen Wu

Abstract: Mahjong, a tile-based game, is a complex four-player stochastic game of imperfect information involving both strategy and luck. Due to its inherent randomness, accurately assessing the strength of players requires a large number of games, which is time-consuming. This randomness primarily originates from two factors: (1) the initial arrangement of the wall and (2) tile stealing by players. Both affect the tiles players draw and thus influence game outcomes. To address the effect of these factors, especially the randomness introduced by stealing, we propose a novel method, called the stable draw wall (abbr. SDW). The SDW partitions the original wall into individual sub-walls for each player, ensuring that the tile drawing order of each player remains consistent and does not change by stealing from any player. The experimental results showed that when playing a small number of games, the win rate of a player by using the SDW is more accurate than by using the original wall. Consequently, our proposed method significantly mitigates the randomness effect caused by changing the order of draws, allowing a more reliable evaluation of the strength of players, which should focus on strategic decision making.
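Illustrative sketch (not taken from the paper): the effect of giving each player a separate sub-wall can be shown with a toy model in which tiles are just integers. The round-robin partition rule and all function names are assumptions; the abstract does not specify how the SDW splits the wall.

```python
def build_sub_walls(wall, n_players=4):
    """Split one wall into per-player sub-walls (round-robin split is an assumption)."""
    return [wall[p::n_players] for p in range(n_players)]

def draws_shared(wall, turn_order):
    """Tiles each player receives when everyone draws from a single shared wall."""
    drawn = {}
    for tile, player in zip(wall, turn_order):
        drawn.setdefault(player, []).append(tile)
    return drawn

def draws_sdw(sub_walls, turn_order):
    """Tiles each player receives when each player draws only from their own sub-wall."""
    drawn = {}
    for player in turn_order:
        k = len(drawn.setdefault(player, []))  # how many tiles this player has drawn so far
        drawn[player].append(sub_walls[player][k])
    return drawn

wall = list(range(16))           # toy wall: tile ids 0..15
sub_walls = build_sub_walls(wall)

normal_turns = [0, 1, 2, 3, 0, 1, 2, 3]
# Player 2 steals player 0's first discard, so play resumes with player 3
# and players 1 and 2 skip a draw.
stolen_turns = [0, 3, 0, 1, 2, 3]

shared_normal = draws_shared(wall, normal_turns)
shared_stolen = draws_shared(wall, stolen_turns)
sdw_normal = draws_sdw(sub_walls, normal_turns)
sdw_stolen = draws_sdw(sub_walls, stolen_turns)
```

With the shared wall, player 3's first tile depends on whether the steal happened; with per-player sub-walls it is the same tile in both runs.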

Paper Nr: 79
Title:

Beyond Discrete Environments: Benchmarking Regret-Based Automatic Curriculum Learning in MuJoCo

Authors:

Chin-Jui Chang, Chen-Xing Li, Jan Seyler and Shahram Eivazi

Abstract: Training robust reinforcement learning (RL) agents capable of performing well in unseen scenarios remains a significant challenge. Curriculum learning has emerged as a promising approach to build transferable skills and enhance overall robustness. This paper investigates regret-based adversarial methods for automatically generating curricula, extending their evaluation beyond simple environments to the more complex MuJoCo suite. We benchmark several state-of-the-art regret-based methods against traditional baselines, revealing that while these methods generally outperform baselines, the performance gains are less substantial than anticipated in these more complex environments. Moreover, our study provides valuable insights into the application of regret-based curriculum learning methods to continuous parameter spaces and highlights the challenges involved. We discuss promising directions for improvement and offer perspectives on how current automatic curriculum learning techniques can be applied to real-world tasks.

Paper Nr: 84
Title:

A Hybrid Approach for Assessing Research Text Clarity by Combining Semantic and Quantitative Measures

Authors:

Pranit Prasant Pai, Kaashika Agrawal, Anushri Anil, Archit Saigal and Arti Arya

Abstract: The concept of clarity of text is one that can be quite subjective in nature. This work aims to evaluate the clarity of published research in terms of two key components: Semantic Clarity and Quantitative Clarity. Semantic clarity aims to assess how effectively the meaning of the text is structured, articulated, and conveyed to the reader, and quantitative clarity employs a combination of previously defined formulations and metrics to provide measurable insights into the clarity of the text. Semantic Clarity, predicted using a BERT model, achieved a final validation Mean Squared Error of 0.0169, while Quantitative Clarity, predicted using a DistilBERT model, achieved a training loss of 6.9776 and a validation loss of 3.6322. By integrating these two dimensions, this study seeks to enhance the overall evaluation process and contribute to a more nuanced understanding of research quality.

Paper Nr: 93
Title:

BevGAN: Generative Fisheye Cross-View Transformers

Authors:

Rania Benaissa, Antonyo Musabini, Rachid Benmokhtar, Manikandan Bakthavatchalam and Xavier Perrotton

Abstract: Current parking assistance and monitoring systems synthesize Bird's Eye View (BEV) images to improve drivers' visibility. These BEV images are created using a popular perspective transform called Inverse Perspective Mapping (IPM), which projects pixels of surround-view images captured by onboard fisheye cameras onto a flat plane. However, IPM faces challenges in accurately representing objects with varying heights and seamlessly stitching together the projected surround-views due to its reliance on rigid geometric transformations. To address these limitations, we present BevGAN, a novel geometry-guided Conditional Generative Adversarial Network (cGAN) model that combines multi-scale discriminators with a transformer-based generator that leverages fisheye camera calibration and attention mechanisms to implicitly model geometrical transformations between the views. Experimental results demonstrate that BevGAN outperforms IPM and state-of-the-art cross-view image generation methods in terms of image fidelity and quality. Compared to IPM, we report an improvement of +6.2 dB on PSNR and +170% on MS-SSIM when evaluated on a synthetic dataset depicting both parking and driving scenarios. Furthermore, the generalization ability of BevGAN on real-world fisheye images is also demonstrated through zero-shot inference.

Paper Nr: 98
Title:

Sequential Counter Encoding for Staircase At-Most-One Constraints

Authors:

Hieu Xuan Truong, Tuyen Van Kieu and Khanh Van To

Abstract: This paper presents a new SAT encoding to represent Staircase At-Most-One (SCAMO) constraints by combining similar sub-formulae between At-Most-One (AMO) constraints within constructing blocks. The SCAMO constraints exhibit a staircase shape due to the structural similarity between consecutive AMO constraints. The proposed method utilizes Sequential Counter (SC) encoding to represent each block in a staircase form, taking advantage of connecting the constraint representation for two consecutive blocks. Compared to the existing SCAMO representation based on Binary Decision Diagrams (BDD), our method requires fewer variables and clauses, resulting in improved solving time for SCAMO. Experimental results on real-world problems, such as Anti-bandwidth problems, demonstrate that the SC encoding representation method for SCAMO consistently outperforms alternative methods.
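Illustrative sketch (not taken from the paper): the building block named in the abstract, the Sequential Counter encoding of a single At-Most-One constraint, is a standard construction and can be generated as below. This is not the staircase encoding the paper proposes, which shares sub-formulae between consecutive constraints. Variables are positive integers in DIMACS style; the function names are hypothetical.

```python
from itertools import product

def amo_sequential_counter(xs, next_var):
    """Clauses enforcing at most one of the variables in `xs` is true.

    Uses auxiliary variables s[i], meaning "some of xs[0..i] is true",
    numbered from `next_var`. Returns (clauses, next unused variable id).
    Produces n - 1 auxiliary variables and 3n - 4 clauses for n >= 2.
    """
    n = len(xs)
    if n <= 1:
        return [], next_var
    s = [next_var + i for i in range(n - 1)]
    clauses = [[-xs[0], s[0]]]                  # x1 -> s1
    for i in range(1, n - 1):
        clauses.append([-xs[i], s[i]])          # xi -> si
        clauses.append([-s[i - 1], s[i]])       # s(i-1) -> si
        clauses.append([-xs[i], -s[i - 1]])     # xi and s(i-1) cannot both hold
    clauses.append([-xs[n - 1], -s[n - 2]])     # xn and s(n-1) cannot both hold
    return clauses, next_var + n - 1

def satisfiable(clauses, assignment, aux_vars):
    """Brute-force check: can the auxiliary variables be set so all clauses hold?"""
    for bits in product([False, True], repeat=len(aux_vars)):
        full = dict(assignment)
        full.update(zip(aux_vars, bits))
        if all(any(full[abs(lit)] == (lit > 0) for lit in clause) for clause in clauses):
            return True
    return False

# Encode "at most one of x1..x4"; auxiliary variables start at id 5.
clauses, next_free = amo_sequential_counter([1, 2, 3, 4], 5)
```

The brute-force helper is only for checking small instances; in practice the clause list would be handed to a SAT solver.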

Paper Nr: 115
Title:

Centralised Urban Traffic Routing Using Mixed-Integer Programming

Authors:

Andrii Nyporko, Matyáš Švadlenka, Nikolai Antonov, Mohammad Rohaninejad and Lukáš Chrpa

Abstract: The increase in the urban population over the past decades has led to an increase in the number of vehicles in urban road networks, especially in larger metropolitan areas. The problem is exacerbated during rush hours and when an unexpected or rare event occurs (e.g. accidents, concerts). Existing traffic routing methods, including those embedded in modern navigation systems, consider Dynamic User Optimal (DUO) traffic routing that generates routes in a decentralised fashion. Centralised traffic routing, which we consider in this paper, benefits from a global perspective of the situation and can therefore utilise the road network more effectively. We propose a technique leveraging Mixed-Integer Programming (MIP) for distributing vehicles in the road network while minimising traffic intensity on road segments. Our evaluation shows the potential of the proposed technique for centralised traffic routing.

Paper Nr: 127
Title:

Computing Improved Explanations for Random Forests: k-Majoritary Reasons

Authors:

Louenas Bounia and Insaf Setitra

Abstract: This work focuses on improving explanations for random forests, which, although efficient and providing reliable predictions through the combination of multiple decision trees, are less interpretable than individual decision trees. To improve their interpretability, we introduce k-majoritary reasons, which are minimal implicants for inclusion supporting the decisions of at least k trees, where k is greater than or equal to the majority of the trees in the forest. These reasons are robust and provide a better explanation of the forest’s decision. However, due to their large size and our cognitive limitations, they may be too hard to interpret. To overcome this obstacle, we propose probabilistic majoritary explanations, which provide a more concise interpretation while maintaining a strict majority of trees. We identify the computational complexity of these explanations and propose algorithms to generate them. Our experiments demonstrate the effectiveness of these algorithms and the improvement in interpretability in terms of size provided by probabilistic majoritary explanations (δ-probable majoritary reasons).

Paper Nr: 128
Title:

Computing an Approximating Version of a Minimum-Size Explanation for Boolean Decision Tree Classifiers

Authors:

Louenas Bounia

Abstract: In this work, we tackle the problem of approximating minimum-size explanations for decision trees using a greedy algorithm with guarantees. Calculating a minimum-size abductive explanation can be a time-consuming task due to several factors. First, the combinatorial explosion of possible abductive explanations makes finding the minimum-size explanation extremely costly, even for restricted classifier families like decision trees. Indeed, finding a minimum-size abductive explanation for decision trees is an NP-hard problem, meaning that exact approaches can be very time-consuming, particularly for hard instances and high-dimensional inputs. This adds additional complexity and time to the process of finding the minimum-size explanation. Faced with these complexity challenges, approximate or heuristic approaches are often used to reduce computational load and obtain results more quickly, even if this comes at the cost of solution optimality. In this work, we propose a greedy algorithm to efficiently approximate minimum-size abductive explanations. Based on various experiments aimed at explaining decision tree predictions, we show that for difficult-to-explain instances, our greedy algorithm provides an effective alternative to exact approaches based on SAT encodings.
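Illustrative sketch (not taken from the paper): a simple deletion-based greedy procedure for a Boolean decision tree is shown below. It yields a subset-minimal abductive explanation, which only approximates a minimum-size one and carries none of the guarantees the abstract mentions; it is not the authors' algorithm. The tree encoding (a leaf is a class label, an internal node is a tuple `(feature, subtree_if_false, subtree_if_true)`) and all names are assumptions, and each feature is assumed to be tested at most once per path.

```python
def predict(tree, instance):
    """Follow the instance down the tree to a leaf label."""
    while isinstance(tree, tuple):
        feature, if_false, if_true = tree
        tree = if_true if instance[feature] else if_false
    return tree

def reachable_classes(tree, instance, fixed):
    """Labels of all leaves reachable when only the features in `fixed` are known."""
    if not isinstance(tree, tuple):
        return {tree}
    feature, if_false, if_true = tree
    if feature in fixed:
        branch = if_true if instance[feature] else if_false
        return reachable_classes(branch, instance, fixed)
    # Unknown feature: both branches remain possible.
    return (reachable_classes(if_false, instance, fixed)
            | reachable_classes(if_true, instance, fixed))

def greedy_explanation(tree, instance):
    """Drop features one by one while the prediction stays guaranteed."""
    target = predict(tree, instance)
    fixed = set(range(len(instance)))
    for feature in range(len(instance)):
        trial = fixed - {feature}
        if reachable_classes(tree, instance, trial) == {target}:
            fixed = trial
    return fixed

# Toy tree: class 1 if x0 is true, otherwise the class equals x1.
toy_tree = (0, (1, 0, 1), 1)
explanation = greedy_explanation(toy_tree, [True, True, False])
```

The result depends on the order in which features are tried, which is one reason a deletion-based pass can miss the smallest explanation on harder instances.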

Paper Nr: 129
Title:

Equivariant and SE(2)-Invariant Neural Network Leveraging Fourier-Based Descriptors for 2D Image Classification

Authors:

Emna Ghorbel, Achraf Ghorbel and Faouzi Ghorbel

Abstract: This paper introduces a novel deep learning framework for 2D shape classification that emphasizes equivariance and invariance through Generalized Finite Fourier-based Descriptors (GFID). Instead of relying on raw images, we extract contours from 2D shapes and compute equivariant, invariant, and stable descriptors, which represent shapes as column vectors in complex space. This approach achieves invariance to parameterization and rigid transformations, while reducing the number of network parameters. We evaluate the proposed lightweight neural network framework by testing it against a simple CNN and a pre-trained InceptionV3, first using the original test set and then with rotated and translated images from well-known benchmarks. Experimental results demonstrate the effectiveness of our method under rigid transformations, showcasing the benefits of Fourier-based invariants for robust classification.

Paper Nr: 132
Title:

Curiosity Driven Reinforcement Learning for Job Shop Scheduling

Authors:

Alexander Nasuta, Marco Kemmerling, Hans Zhou, Anas Abdelrazeq and Robert H. Schmitt

Abstract: The Job Shop Problem (JSP) is a well-known NP-hard problem with numerous applications in manufacturing and other fields. Efficient scheduling is critical for producing customized products in the manufacturing industry in time. Typically, the quality metrics of a schedule, such as the makespan, can only be assessed after all tasks have been assigned, leading to sparse reward signals when framing JSP as a reinforcement learning (RL) problem. Sparse rewards pose significant challenges for many RL algorithms, often resulting in slow learning behavior. Curiosity algorithms, which introduce intrinsic reward signals, have been shown to accelerate learning in environments with sparse rewards. In this study, we explored the effectiveness of the Intrinsic Curiosity Module (ICM) and Episodic Curiosity (EC) by benchmarking them against state-of-the-art methods. Our experiments demonstrate that the use of curiosity significantly increases the number of states encountered by the RL agent. When the intrinsic and extrinsic reward signals are of comparable magnitude, agents with the ICM module are able to escape local optima and discover better solutions.

Paper Nr: 142
Title:

Powerful & Generalizable, Why not both? VA: Various Attacks Framework for Robust Adversarial Training

Authors:

Samer Khamaiseh, Deirdre Jost, Abdullah Al-Alaj and Ahmed Aleroud

Abstract: Due to its effectiveness, adversarial training (AT) is becoming the first choice to improve the robustness of deep learning models against adversarial attacks. AT is formulated as a min-max optimization problem. The performance of AT is essentially reliant on the inner optimization problem (i.e., max optimization), which requires the generation of adversarial examples. Most AT methods rely on a single attack to craft these examples, neglecting the impact of image-class robustness on the adversarial training. This oversight leads to shortcomings such as poor generalization on both perturbed and clean data, unreliable robustness against unseen adversarial attacks, and limited exploration of the perturbation space. Therefore, an investigation and analysis of AT robustness via adapting various attacks based on image-class robustness is still unaddressed. In this paper, we propose Various Attacks (VA), a novel framework for a robust and generalizable adversarial training based on image-class robustness. Our framework introduces two novel components: Advanced Curriculum Training (ACT), which ensures the diversity of adversarial attacks by gradually increasing attack strength while rotating through these attacks, and Class-Attack Assignment (CAA), which adaptively determines and assigns the optimal adversarial attack to each image-class to maximize the loss. The proposed framework trains image classification neural networks using a variety of adversarial attacks that significantly improve the generalization robustness. The results of experiments on two benchmark datasets show the superiority of the VA framework over state-of-the-art adversarial training methods.
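For reference, the min-max formulation mentioned in the abstract is commonly written as below. The notation is the usual textbook one and is an assumption here, not necessarily the paper's: $\theta$ are the model parameters, $f_\theta$ the classifier, $\mathcal{L}$ the training loss, $\mathcal{D}$ the data distribution, $\delta$ the perturbation and $\epsilon$ its budget under an $\ell_p$ norm.

```latex
\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}}
\left[ \max_{\|\delta\|_{p} \le \epsilon}
\mathcal{L}\bigl(f_{\theta}(x + \delta),\, y\bigr) \right]
```

The inner maximization is where an attack generates the adversarial example; the outer minimization is the ordinary training step on that example.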

Paper Nr: 145
Title:

Iterative Environment Design for Deep Reinforcement Learning Based on Goal-Oriented Specification

Authors:

Simon Schwan and Sabine Glesner

Abstract: Deep reinforcement learning solves complex control problems but is often challenging for non-experts to apply in practice. Goal-oriented specification allows abstract goals to be defined in a tree and thereby aims at lowering the entry barriers to RL. However, finding an effective specification and translating it to an RL environment is still difficult. We address this challenge with our idea of iterative environment design and automate the construction of environments from goal trees. We validate our method based on four established case studies and our results show that learning goals by iteratively refining specifications is feasible. In this way, we counteract the common trial-and-error practice in development to accelerate the use of RL in real-world applications.

Paper Nr: 146
Title:

HPE-DARTS: Hybrid Pruning and Proxy Evaluation in Differentiable Architecture Search

Authors:

Hung-I. Lin, Lin-Jing Kuo and Sheng-De Wang

Abstract: Neural architecture search (NAS) has emerged as a powerful methodology for automating deep neural network design, yet its high computational cost limits practical applications. We introduce Hybrid Pruning and Proxy Evaluation in Differentiable Architecture Search (HPE-DARTS), integrating soft and hard pruning with a proxy evaluation strategy to enhance efficiency. A warm-up phase stabilizes network parameters, soft pruning via NetPerfProxy accelerates iteration, and hard pruning eliminates less valuable operations to refine the search space. Experiments demonstrate HPE-DARTS reduces search time and achieves competitive accuracy, addressing the reliance on costly validation. This scalable approach offers a practical solution for resource-constrained NAS applications.

Paper Nr: 147
Title:

GWNet: A Lightweight Model for Low-Light Image Enhancement Using Gamma Correction and Wavelet Transform

Authors:

Ming-Yu Kuo and Sheng-De Wang

Abstract: Low-light image enhancement is essential for improving visual quality in various applications. We introduce GammaWaveletNet (GWNet), a novel approach that is composed of a gamma correction module and a wavelet network. The wavelet network is a sequential model with an L subnetwork and an H subnetwork. Both subnetworks use a U-Net architecture with a Spatial Wavelet Interaction (SWI) component that makes use of wavelet transforms and convolution layers. The L subnetwork handles low-frequency components, while the H subnetwork refines high-frequency details, effectively combining spatial and frequency domain information for superior performance. Experimental results across datasets of different sizes demonstrate that GWNet achieves performance on par with state-of-the-art methods in terms of Peak Signal-to-Noise Ratio and Structural Similarity Index. Notably, the incorporation of wavelet transforms in GWNet leads to remarkable computational efficiency, reducing GFLOPs by approximately 75% and parameters by 40%, highlighting its potential for real-time applications on resource-constrained devices.

Paper Nr: 151
Title:

A Two-Phase Safe Reinforcement Learning Framework for Finding the Safe Policy Space

Authors:

A. J. Westley and Gavin Rens

Abstract: As reinforcement learning (RL) expands into safety-critical domains, ensuring agent adherence to safety constraints becomes crucial. This paper introduces a two-phase approach to safe RL, Violation-Guided Identification of Safety (ViGIS), which first identifies a safe policy space and then performs standard RL within this space. We present two variants: ViGIS-P, which precalculates the safe policy space given a known transition function, and ViGIS-L, which learns the safe policy space through exploration. We evaluate ViGIS in three environments: a multi-constraint taxi world, a deterministic bank robber game, and a continuous cart-pole problem. Results show that both variants significantly reduce constraint violations compared to standard and β-pessimistic Q-learning, sometimes at the cost of achieving a lower average reward. ViGIS-L consistently outperforms ViGIS-P in the taxi world, especially as constraints increase. In the bank robber environment, both achieve perfect safety. A Deep Q-Network (DQN) implementation of ViGIS-L in the cart-pole domain reduces violations compared to a standard DQN. This research contributes to safe RL by providing a flexible framework for incorporating safety constraints into the RL process. The two-phase approach allows for clear separation between safety consideration and task optimization, potentially easing application in various safety-critical domains.
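The second phase described above, standard RL restricted to a previously identified safe space, can be sketched with tabular Q-learning and an action mask. This is a minimal illustration, not the ViGIS algorithm itself: the boolean `safe` table stands in for whatever the first phase produces, and all function and parameter names are hypothetical.

```python
import numpy as np

def safe_greedy_action(q_values, safe_mask):
    """Pick the highest-valued action among those marked safe."""
    q_values = np.asarray(q_values, dtype=np.float64)
    safe_mask = np.asarray(safe_mask, dtype=bool)
    if not safe_mask.any():
        raise ValueError("no safe action available in this state")
    # Unsafe actions get -inf so argmax can never select them.
    return int(np.argmax(np.where(safe_mask, q_values, -np.inf)))

def q_update(q_table, safe, state, action, reward, next_state,
             alpha=0.1, gamma=0.99):
    """One Q-learning step that bootstraps only from safe next actions.

    `safe` is a boolean array of shape (n_states, n_actions), assumed to
    come from a separate safety-identification phase.
    """
    if safe[next_state].any():
        best_next = np.where(safe[next_state], q_table[next_state], -np.inf).max()
    else:
        best_next = 0.0  # assumption: treat a state with no safe action as terminal
    target = reward + gamma * best_next
    q_table[state, action] += alpha * (target - q_table[state, action])

# Usage: two states, two actions; action 1 is unsafe in state 0.
q = np.zeros((2, 2))
safe = np.array([[True, False], [True, True]])
q_update(q, safe, state=0, action=0, reward=1.0, next_state=1)
choice = safe_greedy_action([1.0, 5.0, 3.0], [True, False, True])
```

Keeping the mask separate from the value update mirrors the separation between safety consideration and task optimization that the abstract emphasises.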
Download

Paper Nr: 156
Title:

SAT: Segment and Track Anything for Microscopy

Authors:

Nabeel Khalid, Mohammadmahdi Koochali, Khola Naseem, Maria Caroprese, Gillian Lovell, Daniel A. Porto, Johan Trygg, Andreas Dengel and Sheraz Ahmed

Abstract: Integrating cell segmentation with tracking is critical for achieving a detailed and dynamic understanding of cellular behavior. This integration facilitates the study and quantification of cell morphology, movement, and interactions, offering valuable insights into a wide range of biological processes and diseases. However, traditional methods rely on labor-intensive and costly annotations, such as full segmentation masks or bounding boxes for each cell. To address this limitation, we present SAT: Segment and Track Anything for Microscopy, a novel pipeline that leverages point annotations in the first frame to automate cell segmentation and tracking across all subsequent frames. By significantly reducing annotation time and effort, SAT enables efficient and scalable analysis, making it well-suited for large-scale studies. The pipeline was evaluated on two diverse datasets, achieving over 80% Multiple Object Tracking Accuracy (MOTA), demonstrating its robustness and effectiveness across various imaging modalities and cell types. These results highlight SAT’s potential to streamline biomedical research and enable deeper exploration of cellular behavior.
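For readers unfamiliar with the metric, MOTA is conventionally computed from false negatives, false positives, and identity switches summed over all frames, divided by the total number of ground-truth objects. The sketch below follows that standard definition; the function and argument names are chosen for this example, and the counts are made up rather than taken from the paper.

```python
def mota(false_negatives, false_positives, id_switches, num_gt_objects):
    """Multiple Object Tracking Accuracy from counts summed over all frames.

    MOTA = 1 - (FN + FP + IDSW) / GT. The value can be negative when the
    tracker makes more errors than there are ground-truth objects.
    """
    if num_gt_objects <= 0:
        raise ValueError("num_gt_objects must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt_objects

# Usage with illustrative counts (not results from the paper).
score = mota(false_negatives=10, false_positives=5, id_switches=5,
             num_gt_objects=100)
```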
Download

Paper Nr: 159
Title:

Dynamic Graph Representation with Contrastive Learning for Financial Market Prediction: Integrating Temporal Evolution and Static Relations

Authors:

Yunhua Pei, Jin Zheng and John Cartlidge

Abstract: Temporal Graph Learning (TGL) is crucial for capturing the evolving nature of stock markets. Traditional methods often ignore the interplay between dynamic temporal changes and static relational structures between stocks. To address this issue, we propose the Dynamic Graph Representation with Contrastive Learning (DGRCL) framework, which integrates dynamic and static graph relations to improve the accuracy of stock trend prediction. Our framework introduces two key components: the Embedding Enhancement (EE) module and the Contrastive Constrained Training (CCT) module. The EE module focuses on dynamically capturing the temporal evolution of stock data, while the CCT module enforces static constraints based on stock relations, refined within contrastive learning. This dual-relation approach allows for a more comprehensive understanding of stock market dynamics. Our experiments on two major U.S. stock market datasets, NASDAQ and NYSE, demonstrate that DGRCL significantly outperforms state-of-the-art TGL baselines. Ablation studies indicate the importance of both modules. Overall, DGRCL not only enhances prediction ability but also provides a robust framework for integrating temporal and relational data in dynamic graphs. Code and data are available for public access.
Download

Paper Nr: 160
Title:

Approximate Probabilistic Inference for Time-Series Data: A Robust Latent Gaussian Model with Temporal Awareness

Authors:

Anton Johansson and Arunselvan Ramaswamy

Abstract: The development of robust generative models for highly varied non-stationary time-series data is a complex and important problem. Traditional models for time-series data prediction, such as Long Short-Term Memory (LSTM), are inefficient and generalize poorly as they cannot capture complex temporal relationships. In this paper, we present a probabilistic generative model that can be trained to capture complex temporal information, and that is robust to data errors. We call it Time Deep Latent Gaussian Model (tDLGM). Its novel architecture is an extension of the popular Deep Latent Gaussian Model (DLGM). Our model is trained to minimize a novel regularized version of the free energy loss function (an upper bound for the negative log loss). Our regularizer, which accounts for data trends, facilitates robustness to data errors that arise from additive noise. Experiments conducted show that tDLGM is able to reconstruct and generate complex time-series data. Further, the prediction error does not increase in the presence of additive Gaussian noise.
Download

Paper Nr: 162
Title:

A Framework for Developing Robust Machine Learning Models in Harsh Environments: A Review of CNN Design Choices

Authors:

William Dennis and James Pope

Abstract: Machine Learning algorithms are envisioned to be used in harsh and/or safety-critical environments such as self-driving cars, aerospace, and nuclear sites where the effects of radiation can cause errors in electronics known as Single Event Effects (SEEs). The effect of SEEs on machine learning models, such as neural networks composed of millions of parameters, is currently unknown. Understanding the models in terms of robustness and reliability is essential for their use in these environments. To facilitate this understanding, we propose a novel framework to simulate SEEs during model training and inference. Using the framework, we investigate the robustness of the Convolutional Neural Network (CNN) architecture with dropout, regularisation and activation functions under different error models. Two new activation functions are suggested that decrease error by up to 40% compared to ReLU. We also investigate an alternative pooling layer that can provide model robustness with a 16% decrease in error with ReLU. Overall, our results confirm the efficacy of the framework for evaluating model robustness in harsh environments.
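One simple way to picture a simulated SEE is a single bit flip in the binary representation of a stored weight. The sketch below does this for float32 values with NumPy. It is only an assumed error model for illustration; the abstract does not state which error models the framework uses, and `flip_bit` is a hypothetical helper.

```python
import numpy as np

def flip_bit(weights, index, bit):
    """Return a copy of `weights` (as float32) with one bit flipped.

    index: flat position of the weight to corrupt.
    bit:   bit position, 0 (least significant mantissa bit) to 31 (sign).
    """
    if not 0 <= bit <= 31:
        raise ValueError("bit must be in the range 0..31")
    flipped = np.array(weights, dtype=np.float32, copy=True)
    raw = flipped.view(np.uint32)            # reinterpret the same bytes as integers
    raw.flat[index] ^= np.uint32(1 << bit)   # XOR toggles exactly one bit
    return flipped

# Usage: a sign-bit flip negates the weight, while a flip of the top
# exponent bit of 1.0 turns it into infinity.
w = np.array([1.0, 2.0], dtype=np.float32)
sign_flipped = flip_bit(w, index=0, bit=31)
exponent_flipped = flip_bit(w, index=0, bit=30)
```

The two cases show why bit position matters: a low mantissa bit barely changes a weight, whereas an exponent bit can produce values large enough to saturate an unbounded activation such as ReLU.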
Download

Paper Nr: 165
Title:

ABBIE: Attention-Based BI-Encoders for Predicting Where to Split Compound Sanskrit Words

Authors:

Irfan Ali, Liliana Lo Presti, Igor Spano and Marco La Cascia

Abstract: Sanskrit is a highly composite language, morphologically and phonetically complex. One of the major challenges in processing Sanskrit is the splitting of compound words that are merged phonetically. Recognizing the exact location of splits in a compound word is difficult since several possible splits can be found, but only a few of them are semantically meaningful. This paper proposes a novel deep learning method that uses two bi-encoders and a multi-head attention module to predict the valid split location in Sanskrit compound words. The two bi-encoders process the input sequence in direct and reverse order respectively. The model learns the character-level context in which the splitting occurs by exploiting the correlation between the direct and reverse dynamics of the character sequence. The results of the proposed model are compared with a state-of-the-art technique that adopts a bidirectional recurrent network to solve the same task. Experimental results show that the proposed model correctly identifies where the compound word should be split into its components in 89.27% of cases, outperforming the state-of-the-art technique. The paper also proposes a dataset developed from the repository of the Digital Corpus of Sanskrit (DCS) and the University of Hyderabad (UoH) corpus.
Download

Paper Nr: 167
Title:

A Discrete Non-Additive Integral Based Interval-Valued Neural Network for Enhanced Prediction Reliability

Authors:

Yassine Hmidy and Mouna Ben Mabrouk

Abstract: In this paper, we propose to replace the perceptron of classical feedforward neural networks with a new aggregation function. In a recent paper, it has been shown that this new aggregation is a relevant learning model, simple to use, and informative as it outputs an interval whose size is correlated to the prediction error of the model. Unlike a classical neural network, whose perceptrons are usually composed of a linear aggregation and an activation function, the model we propose here is a mere composition of those aggregation functions. In order to show the relevance of using such a neural network, we rely on experiments comparing its performance with that of a classical neural network.
Download

Paper Nr: 179
Title:

A Mixed Quantization Approach for Data-Free Quantization of LLMs

Authors:

Feng Zhang, Yanbin Liu, Weihua Li, Xiaodan Wang and Quan Bai

Abstract: Large Language Models (LLMs) have demonstrated significant capabilities in intelligent activities such as natural language comprehension, content generation, and knowledge retrieval. However, training and deploying these models require substantial computation resources, setting up a significant barrier for developing AI applications and conducting research. Various model compression techniques have been developed to address the demanding computational resource issue. Nonetheless, there has been limited exploration of high-level quantization strategies that offer more flexibility in balancing the trade-off between memory usage and accuracy. We propose an effective mixed-quantization method named MXQ to bridge this research gap for a better memory-accuracy balance. Specifically, we observe that the weight distributions of LLMs vary considerably from layer to layer, resulting in different tolerances to quantization errors. Motivated by this, we derive a novel quantization optimisation formulation to solve for the layer-wise quantization parameters, while enforcing the overall quantization memory budget as a constraint. The new formulation can be efficiently solved by converting it to a mixed integer programming problem. Experiments show that our method can achieve the 1% accuracy loss goal with an additional bit budget or further reduce memory usage on Llama models. This unlocks a wide range of quantization options and simplifies the memory-accuracy trade-off.
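The layer-wise allocation problem described above can be pictured as choosing one bit-width per layer so that total estimated error is minimal while total memory stays within a budget. The sketch below solves a toy instance by exhaustive search. It is not the MXQ formulation or its mixed integer programming solver: the per-layer error table, the function name, and the numbers are all hypothetical, and brute force is only feasible for a handful of layers.

```python
from itertools import product

def allocate_bits(layer_sizes, layer_errors, bit_options, budget_bits):
    """Choose a bit-width per layer minimising total error under a memory budget.

    layer_sizes:  number of weights in each layer.
    layer_errors: one dict per layer mapping bit-width -> estimated error.
    bit_options:  candidate bit-widths, e.g. (2, 4).
    budget_bits:  maximum total number of bits across all layers.
    Returns (choice, error), or (None, inf) when nothing fits the budget.
    """
    best_choice, best_error = None, float("inf")
    for choice in product(bit_options, repeat=len(layer_sizes)):
        memory = sum(n * b for n, b in zip(layer_sizes, choice))
        if memory > budget_bits:
            continue  # violates the memory constraint
        error = sum(layer_errors[i][b] for i, b in enumerate(choice))
        if error < best_error:
            best_choice, best_error = choice, error
    return best_choice, best_error

# Usage: two equally sized layers with an average budget of 3 bits per weight.
# The first layer is assumed to be more sensitive to quantization.
sizes = [100, 100]
errors = [{2: 4.0, 4: 1.0}, {2: 1.0, 4: 0.5}]
choice, total_error = allocate_bits(sizes, errors, (2, 4), budget_bits=600)
```

In this toy instance the search gives the sensitive layer 4 bits and the tolerant layer 2 bits, which is the kind of uneven allocation that differing layer-wise tolerances would motivate. A real solver would replace the loop with an integer program over the same objective and constraint.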
Download

Paper Nr: 189
Title:

Real-Time Transaction Fraud Detection via Heterogeneous Temporal Graph Neural Network

Authors:

Hang Nguyen and Bac Le

Abstract: As digital transactions grow in prevalence, the threat of fraud has become a critical challenge for businesses and individuals. Fraudsters increasingly employ sophisticated tactics, disguising malicious activities as legitimate behavior, which renders traditional detection methods inadequate. This paper introduces a real-time fraud detection framework leveraging Heterogeneous Temporal Graph Neural Networks (HTGNN) to address these challenges. The proposed approach constructs a heterogeneous temporal graph from transaction data and employs a neural network architecture that integrates spatial, temporal, and semantic information. This allows for a comprehensive representation of transactions, entities, and their dynamic interactions over time. Unlike static approaches, our method captures the temporal evolution of behaviors, ensuring deeper insights into fraudulent patterns. The framework is designed to enhance detection accuracy while maintaining computational efficiency for real-time applications. Through rigorous experimentation and analysis, we expect to demonstrate that the proposed HTGNN framework significantly outperforms existing techniques in identifying fraudulent transactions, ultimately contributing to more robust and effective fraud detection systems.
Download

Paper Nr: 196
Title:

Webcam-Based Pupil Diameter Prediction Benefits from Upscaling

Authors:

Vijul Shah, Brian B. Moser, Ko Watanabe and Andreas Dengel

Abstract: Capturing pupil diameter is essential for assessing psychological and physiological states such as stress levels and cognitive load. However, the low resolution of images in eye datasets often hampers precise measurement. This study evaluates the impact of various upscaling methods, ranging from bicubic interpolation to advanced super-resolution, on pupil diameter predictions. We compare several pre-trained methods, including CodeFormer, GFPGAN, Real-ESRGAN, HAT, and SRResNet. Our findings suggest that pupil diameter prediction models trained on upscaled datasets are highly sensitive to the selected upscaling method and scale. Our results demonstrate that upscaling methods consistently enhance the accuracy of pupil diameter prediction models, highlighting the importance of upscaling in pupillometry. Overall, our work provides valuable insights for selecting upscaling techniques, paving the way for more accurate assessments in psychological and physiological research.
Download

Paper Nr: 205
Title:

The Pros and Cons of Adversarial Robustness

Authors:

Yacine Izza and Joao Marques-Silva

Abstract: Robustness is widely regarded as a fundamental problem in the analysis of machine learning (ML) models. Most often, robustness equates with deciding the non-existence of adversarial examples, where adversarial examples denote situations where small changes to some inputs cause a change in the prediction. The perceived importance of ML model robustness explains the continued progress observed for most of the last decade. Whereas robustness is often assessed locally, i.e. given some target point in feature space, robustness can also be defined globally, i.e. where any point in feature space can be considered. The importance of ML model robustness is illustrated, for example, by the existing competition on neural network (NN) verification (VNN-COMP), which assesses the progress of robustness tools for NNs, but also by efforts towards robustness certification. More recently, robustness tools have also been used for computing rigorous explanations of ML models. Despite the continued advances in robustness, this paper uncovers some limitations with existing definitions of robustness, both global and local, but also with efforts towards robustness certification. The paper also investigates uses of adversarial examples besides those related to robustness.
Download

Paper Nr: 212
Title:

TAL4Tennis: Temporal Action Localization in Tennis Videos Using State Space Models

Authors:

Ahmed Jouini, Mohamed Ali Lajnef, Faten Chaieb-Chakchouk and Alex Loth

Abstract: Temporal action localization is a classic computer vision problem in video understanding with a wide range of applications. In the context of sports videos, it is integrated into most of the current solutions used by coaches, broadcasters and game specialists to assist in performance analysis, strategy development, and enhancing the viewing experience. This work presents an application study on temporal action localization for tennis broadcast videos. We study and evaluate a foundational video understanding model for identifying tennis actions in match footage. We explore its architecture, specifically the state space model, from video input to the prediction of temporal segments and classification labels. Our experiments provide findings and interpretations of the model’s performance on tennis data. We achieved an average mean Average Precision (mAP) of 66.14% over all thresholds on the TenniSet dataset, surpassing the other methods, and 96.16% on our private French Open dataset.
Download

Paper Nr: 214
Title:

YeastFormer: An End-to-End Instance Segmentation Approach for Yeast Cells in Microstructure Environment

Authors:

Khola Naseem, Nabeel Khalid, Lea Bertgen, Johannes M. Herrmann, Andreas Dengel and Sheraz Ahmed

Abstract: Cell segmentation is a crucial task, especially in microstructured environments commonly used in synthetic biology. Segmenting cells in these environments becomes particularly challenging when the cells and the surrounding traps share similar characteristics. While deep learning-based methods have shown success in cell segmentation, limited progress has been made in segmenting yeast cells within such complex environments. Most current approaches rely on traditional machine learning techniques. To address this challenge, the study proposed a transfer-based instance segmentation approach to tackle both cell and trap segmentation in microstructured environments. The attention-based mechanism in the model’s backbone enables a more precise focus on key features, leading to improved segmentation accuracy. The proposed approach outperforms existing state-of-the-art methods, achieving a 5% improvement in terms of Intersection over Union (IoU) for the segmentation of both cells and traps in microscopic images.
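Intersection over Union, the metric reported above, compares a predicted mask with a ground-truth mask by dividing the area they share by the area either one covers. The sketch below computes it for binary masks with NumPy. The function name and the convention of returning 1.0 for two empty masks are choices made for this example.

```python
import numpy as np

def mask_iou(pred, target):
    """Intersection over Union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # assumption: two empty masks count as a perfect match
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)

# Usage: the masks share one pixel out of the three pixels covered in total.
pred = [[1, 1], [0, 0]]
target = [[1, 0], [1, 0]]
iou = mask_iou(pred, target)
```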
Download

Paper Nr: 215
Title:

CNN-Trans: A Two-Branch CNN Transformer Model for Multivariate Time Series Classification

Authors:

Sarra Hassine, Sourour Ammar and Ilef Ben Slima

Abstract: The extensive presence of sensors in multiple domains has led to the generation of enormous amounts of multivariate time series data, presenting significant challenges for efficient classification. Although contemporary artificial intelligence methods show promising performance in addressing such data, they often struggle to capture both long-range dependencies and intricate local patterns within the sequences. This paper introduces CNN-Trans, an innovative deep learning model designed specifically for multivariate time series classification to address the mentioned challenge. CNN-Trans combines the strengths of transformers and convolutional neural networks (CNN). The proposed model uses a parallel strategy with both a transformer encoder and a CNN encoder working simultaneously on the time series data. The transformer captures global relationships through self-attention, while the CNN extracts localized spatial features tailored to each variable. We evaluate CNN-Trans on various benchmark datasets encompassing diverse sensor applications. The results show that our model is robust and highly effective for complex data. CNN-Trans outperforms others with 93.33% on NATOPS and 98.37% on PenDigits, excelling in high-dimensional datasets like Kitchen (95.74%) and HAR (87.41%). Additionally, CNN-Trans exhibits robustness and generalizability across different input features, showcasing its practical utility in real-world scenarios.
Download

Paper Nr: 222
Title:

Markov Process-Based Graph Convolutional Networks for Entity Classification in Knowledge Graphs

Authors:

Johannes Mäkelburg, Yiwen Peng, Mehwish Alam, Tobias Weller and Maribel Acosta

Abstract: Despite the vast amount of information encoded in Knowledge Graphs (KGs), information about the class affiliation of entities often remains incomplete. Graph Convolutional Networks (GCNs) have been shown to be effective predictors of complete information about the class affiliation of entities in KGs. However, these models do not learn the class affiliation of entities in KGs incorporating the complexity of the task, which negatively affects the models’ prediction capabilities. To address this problem, we introduce a Markov process-based architecture into well-known GCN architectures. This end-to-end network learns the prediction of class affiliation of entities in KGs within a Markov process. The number of computational steps is learned during training using a geometric distribution. At the same time, the loss function combines insights from the field of evidential learning. The experiments show a performance improvement over existing models in several studied architectures and datasets. Based on the chosen hyperparameters for the geometric distribution, the expected number of computation steps can be adjusted to improve efficiency and accuracy during training.
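To see how a geometric distribution ties a hyperparameter to the expected number of computation steps, suppose the process halts at each step with a constant probability p. The probability of halting exactly at step n is then p(1 - p)^(n - 1) and the expected number of steps is 1/p. The sketch below encodes these two standard facts. The constant halting probability and the function names are assumptions for illustration, and the paper's actual parameterisation may differ.

```python
def halting_distribution(p, max_steps):
    """Probability of halting exactly at step n = 1..max_steps.

    Assumes a constant per-step halting probability p (geometric distribution).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return [p * (1.0 - p) ** (n - 1) for n in range(1, max_steps + 1)]

def expected_steps(p):
    """Mean of the geometric distribution: 1 / p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return 1.0 / p

# Usage: a smaller halting probability means more computation steps on average.
probs = halting_distribution(0.5, max_steps=3)
few_steps = expected_steps(0.5)
many_steps = expected_steps(0.25)
```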
Download

Paper Nr: 252
Title:

Knowledge Graph Enrichments for Credit Account Prediction

Authors:

Michael Schulze and Andreas Dengel

Abstract: For the problem of credit account prediction on the basis of received invoices, this paper presents a pipeline consisting of 1) construction of an accounting knowledge graph, 2) enrichment algorithms, and 3) prediction of credit accounts with methods of a) rule-based link prediction, b) case-based reasoning, and c) a combination of both. Explainability and traceability have been key requirements. While preserving the order of invoices in cross-fold validation, key findings in our scenario are: 1) using all enrichments from the pipeline increases prediction performance by up to 12.45 percentage points, 2) single enrichments are useful on their own, 3) case-based reasoning benefits most from having enrichments available, and 4) the combination of link prediction and case-based reasoning yields the best prediction results in our scenario. Paper page: https://git.opendfki.de/michael.schulze/account-prediction.
Download

Paper Nr: 253
Title:

A Deep Learning Approach for Predicting the Response to Anti-VEGF Treatment in Diabetic Macular Edema Patients Using Optical Coherence Tomography Images

Authors:

Karima Garraoui, Ines Rahmany, Salah Dhahri, Hedi Tabia, Desiré Sidibé, Hsouna Zgolli and Nawres Khlifa

Abstract: Diabetic macular edema (DME) is a serious complication of diabetes that can lead to vision loss, making the prediction of patient response to anti-vascular endothelial growth factor (anti-VEGF) treatment crucial for optimizing therapeutic strategies. This study introduces ESSDP (Extended Siam Saves Diabetes Patients), a novel deep learning approach leveraging a Siamese network architecture with EfficientNetB2 to predict therapeutic response in DME patients through optical coherence tomography (OCT) image analysis. By classifying patients into good or poor responder groups based on central macular thickness reduction after injection, the proposed framework achieved a predictive performance with an accuracy of 0.80, sensitivity of 0.71, precision of 0.89, and an F1-Score of 0.74. These findings highlight the potential of Siamese network-based deep learning architectures as effective tools for predicting treatment outcomes in DME patients, even when working with limited datasets, and pave the way for enhancing personalized treatment strategies in ophthalmology.
Download

Paper Nr: 266
Title:

Cellular Automata-Based Model for Simulation of Collective Pedestrian Dynamics in Indoor Environments with Surmountable Obstacles

Authors:

Eduardo C. Silva, Gabriela S. Damazo, Gina M. B. Oliveira and Luiz G. A. Martins

Abstract: Understanding and predicting human behavior in normal and emergency situations is a difficult task that attracts the attention of many researchers. In this sense, modeling and simulation of collective pedestrian dynamics (CPD) is essential in society, as it is used in various scenarios, such as urban planning and public safety. Cellular Automata stand out as simple computational tools capable of identifying and reproducing the complexity of various patterns, such as pedestrian movement, especially during evacuation in emergency situations. Models of this type take several parameters into consideration, such as the strategy for choosing the floor, the interaction between pedestrians, social phenomena, such as panic and the tendency to follow crowds, among others. This work proposes a model based on cellular automata for modeling CPD, strongly based on the Varas Model, which combines three changes to bring the simulation closer to reality. These are: changing the movement dynamics, presenting the separation between surmountable and impassable obstacles, and changing the permission to pass between objects diagonally. These updates speed up the pedestrian evacuation process and increase the level of credibility of the simulations compared to reality.
Download

Paper Nr: 268
Title:

Household Task Planning with Multi-Objects State and Relationship Using Large Language Models Based Preconditions Verification

Authors:

Jin Aoyama, Sudesna Chakraborty, Takeshi Morita, Shusaku Egami, Takanori Ugai and Ken Fukuda

Abstract: We propose a novel approach to household task planning that leverages Large Language Models (LLMs) to comprehend and consider environmental states. Unlike previous methods that depend primarily on commonsense reasoning or visual inputs, our approach focuses on understanding object states and relationships within the environment. To evaluate this capability, we developed a specialized dataset of household tasks that specifically tests LLMs’ ability to reason about object states, identifiers, and relationships. Our method combines simulator-derived environmental state information with LLM-based planning to generate executable action sequences. A key feature of our system is the LLM-driven verification mechanism that assesses whether environmental preconditions are met before each action executes, automatically reformulating action steps when prerequisites are not satisfied. Experimental results using GPT-4o demonstrate strong performance, achieving an 89.4% success rate on state change tasks and 81.6% on placement tasks. Ablation studies confirm the precondition check’s significant contribution to task success. This study establishes both a new methodology for embodied AI reasoning and a benchmark for future work in environment-aware task planning.
Download

Paper Nr: 270
Title:

An Efficient Compilation-Based Approach to Explaining Random Forests Through Decision Trees

Authors:

Alnis Murtovi, Maximilian Schlüter and Bernhard Steffen

Abstract: Tree-based ensemble methods like Random Forests often outperform deep learning models on tabular datasets but suffer from a lack of interpretability due to their complex structures. Existing explainability techniques either offer approximate explanations or face scalability issues with large models. In this paper, we introduce a novel compilation-based approach that transforms Random Forests into single, semantically equivalent decision trees through a recursive process enhanced with optimizations and heuristics. Our empirical evaluation demonstrates that our approach is over an order of magnitude faster than current state-of-the-art compilation-based methods while producing decision trees of comparable size.
Download

Paper Nr: 272
Title:

Industrial Image Grouping Through Pre-Trained CNN Encoder-Based Feature Extraction and Sub-Clustering

Authors:

Selvine G. Mathias, Saara Asif, Muhammad Uzair Akmal, Simon Knollmeyer, Leonid Koval and Daniel Grossmann

Abstract: A common challenge faced by many industries today is the classification of unlabeled image data from production processes into meaningful groups or patterns for better documentation and analysis. This paper presents a sequential approach for leveraging industrial image data to identify patterns in products or processes for plant floor operators. The dataset used is sourced from steel production, and the model architecture integrates feature reduction through convolutional neural networks (CNNs) like VGG, EfficientNet, and ResNet, followed by clustering algorithms to assign appropriate labels to the observed data. The model’s selection criteria combine clustering metrics, including entropy minimization and silhouette score maximization. Once primary clusters are identified, sub-clustering is performed using near-labels, which are pre-assigned to images with initial distinctions. A novel metric, C-Score, is introduced to assess cluster convergence and grouping accuracy. Experimental results demonstrate that this method can address challenges in detecting variations across images, improving pattern recognition and classification.
Download

Paper Nr: 276
Title:

Hierarchically Gated Experts for Efficient Online Continual Learning

Authors:

Kevin Luong and Michael Thielscher

Abstract: Continual Learning models aim to learn a set of tasks under the constraint that the tasks arrive sequentially with no way to access data from previous tasks. The Online Continual Learning framework poses a further challenge where the tasks are unknown and instead the data arrives as a single stream. Building on existing work, we propose a method for identifying these underlying tasks: the Gated Experts (GE) algorithm, where a dynamically growing set of experts allows for new knowledge to be acquired without catastrophic forgetting. Furthermore, we extend GE to Hierarchically Gated Experts (HGE), a method which is able to efficiently select the best expert for each data sample by organising the experts into a hierarchical structure. On standard Continual Learning benchmarks, GE and HGE are able to achieve results comparable with current methods, with HGE doing so more efficiently.
Download

Paper Nr: 277
Title:

Leveraging Large Language Models for Preference-Based Sequence Prediction

Authors:

Michaela Tecson, Daphne Chen, Michelle Zhao, Zackory Erickson and Reid Simmons

Abstract: We present a novel approach to leveraging Large Language Models (LLMs) for action prediction in meal preparation sequences, with a focus on tailoring predictions based on user preferences. We introduce methods using OpenAI’s GPT-4o model to predict subsequent actions in a sequence by providing different forms of context such as sequences from other participants or prior sequences of the test participant. Our approach outperforms baseline methods, including Aggregate Long Short-Term Memory (LSTM) and mixture-of-experts (MoE) models, by up to 33.8% by leveraging the LLM’s ability to adapt predictions based on minimal prior context. We highlight the generalizability of the method across different cooking domains by analyzing the results on two different cooking datasets. This adaptability will be useful for assistive systems aiming to support older adults, especially those with Mild Cognitive Impairments (MCI), in completing complex, sequential tasks in ways that align with the user’s preferences. The full prompts used in this work can be found at the project webpage: sites.google.com/view/preference-based-prediction.
Download

Paper Nr: 283
Title:

CrowdSim++: Unifying Crowd Navigation and Obstacle Avoidance

Authors:

Marco Rosano, Danilo Leocata, Antonino Furnari and Giovanni Maria Farinella

Abstract: In recent years, significant advancements in learning-based technologies have propelled the development of autonomous robotic systems designed to assist humans in challenging scenarios during their daily activities. This research focuses on enhancing robotic perception and control, particularly in navigating complex, crowded environments. Traditional approaches often treat static and dynamic components separately, limiting the robots’ real-world performance. We propose CrowdSim++, an extension of the open-source CrowdSim simulator (Chen et al., 2019), to unify crowd navigation and obstacle avoidance. CrowdSim++ enables training navigation policies in dynamically generated environments or real-world floor plans, using a 2D lidar sensor and a “person sensor” for enhanced perception. Our experiments demonstrate that Reinforcement Learning-based navigation policies trained in complex environments with humans outperform those trained in simpler scenarios. Additionally, providing robots with specialized sensors to accurately distinguish between static and dynamic obstacles is essential for achieving superior performance. To advance research in autonomous navigation, the source code and dataset of realistic floor plans are available at the following link.
Download

Paper Nr: 288
Title:

DiPACE: Diverse, Plausible and Actionable Counterfactual Explanations

Authors:

Jacob Sanderson, Hua Mao and Wai Lok Woo

Abstract: As Artificial Intelligence (AI) becomes integral to high-stakes applications, the need for interpretable and trustworthy decision-making tools is increasingly essential. Counterfactual Explanations (CFX) offer an effective approach, allowing users to explore “what if?” scenarios that highlight actionable changes for achieving more desirable outcomes. Existing CFX methods often prioritize select qualities, such as diversity, plausibility, proximity, or sparsity, but few balance all four in a flexible way. This work introduces DiPACE, a practical CFX framework that balances these qualities while allowing users to adjust parameters according to specific application needs. DiPACE also incorporates a penalty-based adjustment to refine results toward user-defined thresholds. Experimental results on real-world datasets demonstrate that DiPACE consistently outperforms existing methods Wachter, DiCE and CARE in achieving diverse, realistic, and actionable CFs, with strong performance across all four characteristics. The findings confirm DiPACE’s utility as a user-adaptable, interpretable CFX tool suitable for diverse AI applications, with a robust balance of qualities that enhances both feasibility and trustworthiness in decision-making contexts.
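As a rough illustration of the four qualities named in this abstract, the sketch below scores a set of counterfactuals for one query instance. The metric definitions are common choices and are assumptions here, not DiPACE's actual formulation: L1 distance for proximity, number of changed features for sparsity, mean pairwise L1 distance for diversity, and distance to the nearest training point as a plausibility proxy. All function names and data are hypothetical.

```python
from itertools import combinations


def l1(a, b):
    """L1 (Manhattan) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def proximity(query, cfs):
    """Mean L1 distance from the query to each counterfactual (lower is closer)."""
    return sum(l1(query, cf) for cf in cfs) / len(cfs)


def sparsity(query, cfs, tol=1e-9):
    """Mean number of features changed per counterfactual (lower is sparser)."""
    changed = [sum(abs(x - y) > tol for x, y in zip(query, cf)) for cf in cfs]
    return sum(changed) / len(cfs)


def diversity(cfs):
    """Mean pairwise L1 distance among counterfactuals (higher is more diverse)."""
    pairs = list(combinations(cfs, 2))
    if not pairs:
        return 0.0
    return sum(l1(a, b) for a, b in pairs) / len(pairs)


def plausibility(cfs, train):
    """Assumed proxy: mean distance to the nearest training point (lower is more plausible)."""
    return sum(min(l1(cf, t) for t in train) for cf in cfs) / len(cfs)


# Hypothetical toy data: one query, two counterfactuals, two training points.
query = [1.0, 0.0, 3.0]
cfs = [[1.0, 1.0, 3.0], [2.0, 0.0, 1.0]]
train = [[1.0, 1.0, 3.0], [2.0, 0.0, 2.0]]

scores = {
    "proximity": proximity(query, cfs),
    "sparsity": sparsity(query, cfs),
    "diversity": diversity(cfs),
    "plausibility": plausibility(cfs, train),
}
```

A framework that balances these qualities would typically combine such scores into a weighted objective; the weights are where user-adjustable parameters would enter.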
Download

Paper Nr: 290
Title:

Refining High-Quality Labels Using Large Language Models to Enhance Node Classification in Graph Echo State Network

Authors:

Ikhlas Bargougui, Rebh Soltani and Hela Ltifi

Abstract: Graph learning has attracted significant attention due to its applicability in various real-world scenarios involving textual data. Recent advancements, such as Graph Echo State Networks (GESN) within the reservoir computing (RC) paradigm, have shown notable success in node-level classification tasks, especially for heterophilic graphs. However, graph neural networks (GNNs) suffer from the need for a large number of high-quality labels to achieve promising performance. Conversely, large language models (LLMs), with their extensive knowledge bases, have demonstrated impressive zero-shot and few-shot learning abilities, particularly for node classification tasks. However, LLMs struggle with efficiently processing structural data and incur high inference costs. In this paper, we introduce a novel pipeline named LLM-GESN, which involves four flexible components: k-means clustering for active node selection, LLM for difficulty-aware annotation, adaptable post-selection, and GESN model training and prediction. Experimental results demonstrate the effectiveness of LLM-GESN on text-attributed graphs from the Cora, CiteSeer, PubMed, WikiCS, and ogbn-arxiv datasets. Compared to state-of-the-art methods, our LLM-GESN achieved notable test accuracies of 86.67%, 76.63%, 74.58%, 77.09%, and 58.79%, respectively.
Download

Paper Nr: 307
Title:

A Novel Vision Transformer for Camera-LiDAR Fusion Based Traffic Object Segmentation

Authors:

Toomas Tahves, Junyi Gu, Mauro Bellone and Raivo Sell

Abstract: This paper presents Camera-LiDAR Fusion Transformer (CLFT) models for traffic object segmentation, which leverage the fusion of camera and LiDAR data using vision transformers. Building on the methodology of visual transformers that exploit the self-attention mechanism, we extend segmentation capabilities with additional classification options to a diverse class of objects including cyclists, traffic signs, and pedestrians across diverse weather conditions. Despite good performance, the models face challenges under adverse conditions, which underscores the need for further optimization to enhance performance in darkness and rain. In summary, the CLFT models offer a compelling solution for autonomous driving perception, advancing the state-of-the-art in multimodal fusion and object segmentation, with ongoing efforts required to address existing limitations and fully harness their potential in practical deployments.
Download

Paper Nr: 314
Title:

RePAD3: Advanced Lightweight Adaptive Anomaly Detection for Univariate Time Series of Any Pattern

Authors:

Ming-Chang Lee, Jia-Chun Lin and Sokratis Katsikas

Abstract: Univariate time series anomaly detection is crucial for early risk identification and prompt response, making it essential for diverse applications such as energy usage monitoring, temperature monitoring, and heart rate monitoring. To be applicable and valuable in the real world, anomaly detection must process time series data on the fly, detect anomalies in real time, and adapt to unexpected pattern changes in an efficient and lightweight manner. Several anomaly detection approaches with such capability have been introduced; however, they often generate frequent false positives. In this paper, we present a lightweight and adaptive anomaly detection approach named RePAD3 by leveraging the strengths of two state-of-the-art methods and mitigating their shortcomings with advanced detection and pattern inspection. According to our extensive experiments with real-world time series datasets, RePAD3 demonstrates superior detection accuracy and fewer false positives across various patterns presented in the time series, thereby broadening its real-world applicability.
Download

Paper Nr: 316
Title:

Multi-Agent Trajectory Prediction for Urban Environments with UAV Data Using Enhanced Temporal Kolmogorov-Arnold Networks with Particle Swarm Optimization

Authors:

Mohammad Reza Mohebbi, Elahe Kafash and Mario Döller

Abstract: Accurate trajectory prediction for moving agents such as pedestrians and vehicles is essential for autonomous driving, intelligent navigation, and abnormal behavior detection. Real-time prediction of future movements enhances the development of autonomous vehicles and the efficiency of traffic management systems. In this study, a novel trajectory prediction approach based on Temporal Kolmogorov-Arnold Networks (TKAN) is introduced, using the TUMDOT-MUC dataset collected by Unmanned Aerial Vehicles (UAVs) in Munich, Germany, to model large-scale urban scenarios. To improve prediction accuracy, additional features were extracted from the primary dataset and incorporated into the TKAN architecture, demonstrating a marked performance improvement over general machine learning models. The accuracy of predictions is further refined by tuning hyperparameters of TKAN through Particle Swarm Optimization (PSO). The proposed model provides a robust and reliable solution for the trajectory prediction of multi-agents in challenging urban traffic conditions. This research advances intelligent and effective transportation systems by proposing scalable methods for improved traffic management and safety in densely populated urban areas, ultimately contributing to smarter and more efficient transportation networks.
Download

Paper Nr: 323
Title:

Machine Learning and Deep Learning Approaches for Early Alzheimer’s Detection in Patients with Subjective Cognitive Decline: A Systematic Literature Review

Authors:

Zyad Taouil, Nourhène Ben Rabah and Bénédicte Le Grand

Abstract: This paper investigates the application of machine learning (ML) and deep learning techniques for the early detection of Alzheimer’s Disease (AD) in patients with Subjective Cognitive Decline (SCD), a preclinical AD stage. Traditional diagnosis methods struggle to detect AD at this stage, making ML a promising alternative for early intervention. A systematic literature review (SLR) was conducted to identify and analyze the most effective ML models, data types, and preprocessing techniques for early AD detection. This review highlights that Convolutional Neural Network (CNN), Random Forest, and logistic regression models show high diagnostic accuracy, particularly when applied to multimodal data (e.g., neuroimaging, genetic, and vocal features). Data preprocessing steps such as feature engineering and data augmentation significantly enhance model performance. This paper also explores the practical implications of implementing ML models in clinical settings and discusses system integration, clinician training, and ethical considerations surrounding patient data. This research emphasizes the potential of ML to enhance early AD diagnosis.
Download

Paper Nr: 337
Title:

Co-Activation Graph Analysis of Safety-Verified and Explainable Deep Reinforcement Learning Policies

Authors:

Dennis Gross and Helge Spieker

Abstract: Deep reinforcement learning (RL) policies can demonstrate unsafe behaviors and are challenging to interpret. To address these challenges, we combine RL policy model checking—a technique for determining whether RL policies exhibit unsafe behaviors—with co-activation graph analysis—a method that maps neural network inner workings by analyzing neuron activation patterns—to gain insight into the safe RL policy’s sequential decision-making. This combination lets us interpret the RL policy’s inner workings for safe decision-making. We demonstrate its applicability in various experiments.
Download

Paper Nr: 352
Title:

PurGE: Towards Responsible Artificial Intelligence Through Sustainable Hyperparameter Optimization

Authors:

Gauri Vaidya, Meghana Kshirsagar and Conor Ryan

Abstract: Hyperparameter optimization (HPO) plays a crucial role in enhancing the performance of machine learning and deep learning models, as the choice of hyperparameters significantly impacts their accuracy, efficiency, and generalization. Despite its importance, HPO remains a computationally intensive process, particularly for large-scale models and high-dimensional search spaces. This leads to prolonged training times and increased energy consumption, posing challenges in scalability and sustainability. Consequently, there is a pressing demand for efficient HPO methods that deliver high performance while minimizing resource consumption. This article introduces PurGE, an explainable search-space pruning algorithm that leverages Grammatical Evolution to efficiently explore hyperparameter configurations and dynamically prune suboptimal regions of the search space. By identifying and eliminating low-performing areas early in the optimization process, PurGE significantly reduces the number of required trials, thereby accelerating the hyperparameter optimization process. Comprehensive experiments conducted on five benchmark datasets demonstrate that PurGE achieves test accuracies that are competitive with or superior to state-of-the-art methods, including random search, grid search, and Bayesian optimization. Notably, PurGE delivers an average computational speed-up of 47x, reducing the number of trials by 28% to 35%, and achieving significant energy savings, equivalent to approximately 2,384 lbs of CO2e per optimization task. This work highlights the potential of PurGE as a step toward sustainable and responsible artificial intelligence, enabling efficient resource utilization without compromising model performance or accuracy.
Download

Paper Nr: 359
Title:

Belief Re-Use in Partially Observable Monte Carlo Tree Search

Authors:

Ebert Theeuwes, Gabriele Venturato and Gavin Rens

Abstract: Partially observable Markov decision processes (POMDPs) require agents to make decisions with incomplete information, facing challenges like an exponential growth in belief states and action-observation histories. Monte Carlo tree search (MCTS) is commonly used for this, but it redundantly evaluates identical states reached through different action sequences. We propose Belief Re-use in Online Partially Observable Planning (BROPOP), a technique that transforms the MCTS tree into a graph by merging nodes with similar beliefs. Using a POMDP-specific locality-sensitive hashing method, BROPOP efficiently identifies and reuses belief nodes while preserving information integrity through update-descent backpropagation. Experiments on standard benchmarks show that BROPOP enhances reward performance with controlled computational cost.
Download

Paper Nr: 375
Title:

ASPERA: Exploring Multimodal Action Recognition in Football Through Video, Audio, and Commentary

Authors:

Takane Kumakura, Ryohei Orihara, Yasuyuki Tahara, Akihiko Ohsuga and Yuichi Sei

Abstract: This study proposes ASPERA (Action SPotting thrEe-modal Recognition Architecture), a multimodal football action recognition method based on the ASTRA architecture that incorporates video, audio, and commentary text information. ASPERA showed higher accuracy than models using video and audio only, excluding invisible actions in the video. This result demonstrates the advantage of this multimodal approach. Additionally, we propose three advanced models: ASPERAsrnd incorporating surrounding commentary text within a ±20-second range, ASPERAcln removing irrelevant background information, and ASPERAMC applying a Markov head to provide prior knowledge of football action flow. ASPERAsrnd and ASPERAcln, which refine the text embedding, enhanced the ability to accurately identify the timing of actions. Notably, ASPERAMC with the Markov head demonstrated the highest accuracy for invisible actions in the football video. ASPERAsrnd and ASPERAcln not only demonstrate the utility of text information in football action spotting but also highlight key factors that enhance this effect, such as incorporating surrounding commentary text and removing background information. Finally, ASPERAMC shows the effectiveness of combining Transformer models and Markov chains for recognizing actions in invisible scenes.
Download

Paper Nr: 384
Title:

Investigating Answer Validation Using Noise Identification and Classification in Goal-Oriented Dialogues

Authors:

Sara Mirabi, Bahadorreza Ofoghi, John Yearwood, Diego Molla-Aliod and Vicky Mak-Hau

Abstract: Goal-oriented conversational systems based on large language models (LLMs) provide the potential capability to gather the necessary requirements for solving tasks or developing solutions. However, in real-world scenarios, non-expert users may respond incorrectly to dialogue questions, which can impede the system’s effectiveness in eliciting accurate information. This paper presents a novel approach to detecting and categorizing noisy answers in goal-oriented conversations, with a focus on modeling linear programming problems. Using a current LLM, Gemini, we develop multi-agent synthetic conversations based on problem statements from the benchmark optimization modeling dataset NL4Opt to generate dialogues that also contain noisy answers. Our experiments show that the LLM is not sufficiently equipped to detect noisy answers; hence, in almost 59% of the cases where there is a noisy answer, the LLM continues the conversation without any attempt at resolving the noise. Thus, we also propose a two-step answer validation method for the identification and classification of noisy answers. Our findings demonstrate that while some LLM and non-LLM-based models perform well in detecting answer inaccuracies, there is a need for further improvements in classifying noisy answers into fine-grained stress types.
Download

Paper Nr: 385
Title:

Model Characterization with Inductive Orientation Vectors

Authors:

Kerria Pang-Naylor, Eric Chen and George D. Montañez

Abstract: As models rise in complexity, black-box evaluation and interpretation methods become critical. We introduce estimation methods for characterizing model-theoretic quantities such as algorithm flexibility, responsiveness to changes in training data, and ability to specialize. These methods are applicable to any black-box classification algorithm. Past theoretical work has shown how such qualities affect probability of task success, generalization, and tendency to overfit. We perform metric estimations of interpretable models across hyperparameters and corroborate the metrics’ behavior with known algorithm heuristics. This work presents a general model-agnostic interpretability tool.
Download

Paper Nr: 387
Title:

Transfer Learning in Deep Reinforcement Learning: Actor-Critic Model Reuse for Changed State-Action Space

Authors:

Feline Malin Barg, Eric Veith and Lasse Hammer

Abstract: Deep Reinforcement Learning (DRL) is a leading method for control in high-dimensional environments, excelling in complex tasks. However, adapting DRL agents to sudden changes, such as reduced sensors or actuators, poses challenges to learning stability and efficiency. While Transfer Learning (TL) can reduce retraining time, its application in environments with sudden state-action space modifications remains underexplored. Resilient, time-efficient strategies for adapting DRL agents to structural changes in state-action space dimension are still needed. This paper introduces Actor-Critic Model Reuse (ACMR), a novel TL-based algorithm for tasks with altered state-action spaces. ACMR enables agents to leverage pre-trained models to speed up learning in modified environments, using hidden layer reuse, layer freezing, and network layer expansion. The results show that ACMR significantly reduces adaptation times while maintaining strong performance with changed state-action space dimensions. The study also provides insights into adaptation performance across different ACMR configurations.
Download

Paper Nr: 392
Title:

Heating up Interactions in an Agent-Based Simulation to Ensure Narrative Interest

Authors:

Gonzalo Méndez and Pablo Gervás

Abstract: Multi-agent systems have become important sources of inspiration for narrative generation systems, with significant growth in solutions based on story sifting: identifying the subset of events generated by such a system that is worthy of being told as a story. Existing systems simulate the romantic behaviour of agents based on simple rules that consider models of social norms and relations, and the evolution of affinities between agents. The present paper describes an extension to one such simulation that inserts several sources of conflict between characters to induce more interesting situations that allow the creation of more engaging stories. The system is empirically shown to give rise to much higher scores on metrics for narrative interest.
Download

Paper Nr: 400
Title:

Optimization of Link Configuration for Satellite Communication Using Reinforcement Learning

Authors:

Tobias Rohe, Michael Kölle, Jan Matheis, Rüdiger Höpfl, Leo Sünkel and Claudia Linnhoff-Popien

Abstract: Satellite communication is a key technology in our modern connected world. With increasingly complex hardware, one challenge is to efficiently configure links (connections) on a satellite transponder. Planning an optimal link configuration is extremely complex and depends on many parameters and metrics. The optimal use of the transponder’s limited resources, bandwidth and power, is crucial. Such an optimization problem can be approximated using metaheuristic methods such as simulated annealing, but recent research results also show that reinforcement learning can achieve comparable or even better performance on optimization problems. However, there have not yet been any studies on link configuration on satellite transponders. To close this research gap, a transponder environment was developed as part of this work. For this environment, the performance of the reinforcement learning algorithm PPO was compared with the metaheuristic simulated annealing in two experiments. The results show that simulated annealing delivers better results for this static problem than the PPO algorithm; however, the research also underlines the potential of reinforcement learning for optimization problems.
Download

Paper Nr: 411
Title:

CriX: Intersection of Crime, Demographics and Explainable AI

Authors:

Muhammad Ashar Reza, Aaditya Bisaria, S. Advaitha, Alekhya Ponnekanti and Arti Arya

Abstract: Crime prediction and analysis often rely on crime statistics but neglect the potential influences of demographic factors. Each locality possesses unique characteristics, indicating that a ‘one-size-fits-all’ methodology is inadequate. This research presents a framework, CriX, that incorporates demographic factors to help understand and address localised crime. At the root level, identifying and predicting crime hotspots is essential for providing context in training the language model; therefore, ST-DBSCAN and LSTM models are respectively used on a custom-made dataset. InLegalBERT (Paul et al., 2023), which is pre-trained on Indian legal data, helps generate embeddings for the large corpus of crime hotspot, demographic and legal data. These embeddings are stored in a FAISS vector store, allowing for dynamic retrieval using RAG techniques. The generated embeddings are then fed into MistralAI, which offers a textual solution. These outputs are further refined using zero-shot learning, increasing model performance. The proposed framework achieved a validation accuracy of over 82% for crime hotspot predictions. The LLM also showcased substantial scores for Compactness, Fidelity and Completeness, giving an average score of 4.18 out of 5, outperforming baseline models. This approach enhances the interpretability of legal models by incorporating the concepts of Explainable AI (XAI).
Download

Paper Nr: 439
Title:

LMSC-UNet: A Lightweight U-Net with Modified Skip Connections for Semantic Segmentation

Authors:

Shrutika S. Sawant, Andreas Medgyesy, Sahana Raghunandan and Theresa Götz

Abstract: U-Net, an encoder-decoder architecture, is the most popular choice in the semantic segmentation field due to its ability to learn rich semantic features while handling enormous amounts of data. However, due to its large number of parameters and slow inference, deploying U-Net on devices with limited computational resources, such as mobile and embedded devices, becomes challenging. To alleviate this challenge, we propose an efficient, lightweight, and robust encoder-decoder architecture, LMSC-UNet, for semantic segmentation that captures more comprehensive contextual information and effectively learns rich semantic features. This lightweight architecture considerably reduces the number of trainable parameters, requiring significantly less memory space, training time, and inference time. Skip connections in the original U-Net fuse features from each encoder block to the corresponding decoder block. This simple skip connection reduces the semantic gap only to some extent and may limit the segmentation performance. Therefore, we replace the skip connection from the second level of U-Net with a bottleneck residual block (BRB), which helps to enhance the final segmentation map by lessening the semantic gap between the decoder features and the corresponding encoder features. Extensive experiments on various segmentation datasets from diverse domains demonstrate the effectiveness of our proposed approach. The experimental results show that the compact model speeds up the inference process while still maintaining the performance. Compared to the standard U-Net, LMSC-UNet achieves a 7× reduction in Floating Point Operations (FLOPs) and a 34× reduction in model size, while maintaining the segmentation accuracy.
Download

Paper Nr: 440
Title:

Enhancing Bilingual Lexicon Induction with Dynamic Translation

Authors:

Michaela Denisová and Pavel Rychlý

Abstract: Bilingual lexicon induction (BLI) has been a popular task for evaluating cross-lingual word embeddings (CWEs). The prevalent metric employed in the evaluation is precision at k, where k represents the number of target words retrieved for each source word. However, establishing a fixed k for the entire evaluation dataset proves challenging due to varying target word counts for each source word. This leads to limited results, compromising either precision or recall. In this paper, we present a novel classification-based approach with dynamic k for bilingual lexicon induction that aims to identify all relevant target words for each source word by exploiting the information derived from the aligned embeddings while offering a balanced trade-off between precision and recall. On top of that, it enables the evaluation of the existing CWEs using dynamic k. Compared to the standard baseline systems and evaluation procedures, it provides competitive results.
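To make the fixed-k problem described in this abstract concrete, the sketch below computes precision and recall at k for two source words with different numbers of gold translations. The words, candidate rankings, and function name are hypothetical, and the code only illustrates the trade-off; it does not implement the paper's classification-based dynamic k.

```python
def precision_recall_at_k(ranked, gold, k):
    """Precision and recall of the top-k retrieved target words.

    ranked: candidate target words ordered by similarity (best first)
    gold:   set of correct target words for the source word
    """
    hits = sum(1 for word in ranked[:k] if word in gold)
    return hits / k, hits / len(gold)


# Hypothetical source words: one has a single translation, one has three.
one_target = {"ranked": ["a1", "x", "y"], "gold": {"a1"}}
three_targets = {"ranked": ["b1", "b2", "b3"], "gold": {"b1", "b2", "b3"}}

# k = 1 suits the first word but misses two translations of the second.
p1_one, r1_one = precision_recall_at_k(one_target["ranked"], one_target["gold"], 1)
p1_three, r1_three = precision_recall_at_k(three_targets["ranked"], three_targets["gold"], 1)

# k = 3 recovers all translations of the second word but hurts precision on the first.
p3_one, r3_one = precision_recall_at_k(one_target["ranked"], one_target["gold"], 3)
p3_three, r3_three = precision_recall_at_k(three_targets["ranked"], three_targets["gold"], 3)
```

With k = 1 the three-translation word has perfect precision but recall of one third; with k = 3 the single-translation word has perfect recall but precision of one third. No single k is right for both, which is the motivation for choosing k per source word.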
Download

Paper Nr: 441
Title:

Graphical Analysis of Abstract Argumentation Frameworks via Boolean Networks

Authors:

Van-Giang Trinh, Belaid Benhamou and Vincent Risch

Abstract: Abstract Argumentation Frameworks (AFs) are the key formalism of abstract argumentation, which is one of the main directions in argumentation research. An AF is mainly studied by means of its extensions, defined as subsets of arguments. In this work, we define a Boolean Network (BN) encoding for AFs, where BNs are a simple and efficient mathematical formalism that has a long history of research. We then show that the attack graph of an AF coincides with the influence graph of its encoded BN, and in particular that preferred and stable extensions of this AF correspond one-to-one to minimal trap spaces and fixed points of the encoded BN, respectively. We also define a new concept for BNs called complete trap space, then show that complete trap spaces (resp. the percolation of the special trap space where all variables are free) in BNs correspond one-to-one (resp. corresponds) to complete extensions (resp. the grounded extension) in AFs. This connection opens up a promising application to graphical analysis of AFs, which is an interesting line of research with many useful applications. More specifically, we use it to explore many new results relating extensions of an AF and (positive or negative) cycles in its attack graph. In particular, we show new upper bounds based on positive feedback vertex sets for the numbers of stable, preferred, and complete extensions.
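The correspondence between stable extensions and fixed points can be checked by brute force on small examples. The sketch below assumes a natural encoding in which each argument's Boolean variable is updated to true exactly when none of its attackers is true; the abstract does not spell out the paper's encoding, so this rule, the function names, and the example frameworks are assumptions for illustration only.

```python
from itertools import product


def fixed_points(args, attacks):
    """Fixed points of the assumed BN: x_a := AND over attackers b of (NOT x_b)."""
    attackers = {a: [b for (b, c) in attacks if c == a] for a in args}
    found = set()
    for bits in product([False, True], repeat=len(args)):
        state = dict(zip(args, bits))
        # Synchronous update of every variable from the current state.
        updated = {a: all(not state[b] for b in attackers[a]) for a in args}
        if updated == state:
            found.add(frozenset(a for a in args if state[a]))
    return found


def stable_extensions(args, attacks):
    """Stable extensions by definition: conflict-free sets attacking every outside argument."""
    found = set()
    for bits in product([False, True], repeat=len(args)):
        chosen = {a for a, bit in zip(args, bits) if bit}
        conflict_free = not any(a in chosen and b in chosen for (a, b) in attacks)
        attacks_rest = all(
            any(b in chosen and c == a for (b, c) in attacks)
            for a in args
            if a not in chosen
        )
        if conflict_free and attacks_rest:
            found.add(frozenset(chosen))
    return found


# Hypothetical AF: a and b attack each other, and b attacks c.
args = ["a", "b", "c"]
attacks = [("a", "b"), ("b", "a"), ("b", "c")]
fp = fixed_points(args, attacks)
st = stable_extensions(args, attacks)

# Hypothetical odd cycle a -> b -> c -> a, which has no stable extension.
cycle_attacks = [("a", "b"), ("b", "c"), ("c", "a")]
fp_cycle = fixed_points(args, cycle_attacks)
st_cycle = stable_extensions(args, cycle_attacks)
```

Both functions enumerate all 2^n subsets, so this is only suitable for checking small frameworks by hand.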
Download

Paper Nr: 443
Title:

Advancing Cross-Lingual Aspect-Based Sentiment Analysis with LLMs and Constrained Decoding for Sequence-to-Sequence Models

Authors:

Jakub Šmíd, Pavel Přibáň and Pavel Král

Abstract: Aspect-based sentiment analysis (ABSA) has made significant strides, yet challenges remain for low-resource languages due to the predominant focus on English. Current cross-lingual ABSA studies often centre on simpler tasks and rely heavily on external translation tools. In this paper, we present a novel sequence-to-sequence method for compound ABSA tasks that eliminates the need for such tools. Our approach, which uses constrained decoding, improves cross-lingual ABSA performance by up to 10%. This method broadens the scope of cross-lingual ABSA, enabling it to handle more complex tasks and providing a practical, efficient alternative to translation-dependent techniques. Furthermore, we compare our approach with large language models (LLMs) and show that while fine-tuned multilingual LLMs can achieve comparable results, English-centric LLMs struggle with these tasks.
Download

Paper Nr: 446
Title:

LLM Output Compliance with Handcrafted Linguistic Features: An Experiment

Authors:

Andrei Olar

Abstract: Can we control the writing style of large language models (LLMs) by specifying desired linguistic features? We address this question by investigating the impact of handcrafted linguistic feature (HLF) instructions on LLM-generated text. Our experiment evaluates various state-of-the-art LLMs using prompts incorporating HLF statistics derived from corpora of CNN articles and Yelp reviews. We find that LLMs demonstrate sensitivity to these instructions, particularly when tasked with conforming to concrete features like word count. However, compliance with abstract features, such as lexical variation, proves more challenging, often resulting in negative impacts on compliance. Our findings highlight the potential and limitations of utilizing HLFs for guiding LLM text generation and underscore the need for further research into optimizing prompt design and feature selection.
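The distinction between concrete and abstract features can be illustrated with two simple measurements. The sketch below computes word count and a type-token ratio as a stand-in for lexical variation, plus a tolerance check against a target word count. The tokenizer, the choice of type-token ratio, the 10% tolerance, and all names are assumptions for illustration, not the paper's feature set or compliance criterion.

```python
import re


def tokenize(text):
    """Lowercased word tokens; a deliberately simple, assumed tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def word_count(text):
    """Concrete feature: number of word tokens."""
    return len(tokenize(text))


def type_token_ratio(text):
    """Assumed proxy for lexical variation: distinct tokens divided by total tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def complies_with_word_count(text, target, tolerance=0.1):
    """True if the word count is within a relative tolerance of the target."""
    return abs(word_count(text) - target) <= tolerance * target


# Hypothetical review-style sample text.
sample = "The food was good and the service was good."
count = word_count(sample)
ttr = type_token_ratio(sample)
```

A word-count target gives the model a number it can aim for directly, whereas a ratio such as this one depends on every word choice at once, which may be one reason abstract features are harder to comply with.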
Download

Paper Nr: 448
Title:

Analysis of the Effectiveness of LLMs in Handwritten Essay Recognition and Assessment

Authors:

Daisy Cristine Albuquerque da Silva, Carlos Luiz Ferreira, Sérgio dos Santos Cardoso Silva and Juliano Bruno de Almeida Cardoso

Abstract: This study investigates the application of Large Language Models (LLMs) for handwritten essay recognition and evaluation within the Military Institute of Engineering (IME) selection process. Utilizing a two-stage methodology, 100 handwritten essays were transcribed using LLMs and subsequently evaluated against predefined linguistic and content criteria by both open-source and closed-source LLMs, including GPT-3.5, GPT-4, o1, LLaMA, and Mixtral. The evaluations were compared to those conducted by IME professors to assess reliability, alignment, and limitations. Results indicate that closed-source models like o1 demonstrated strong reliability and alignment with human evaluations, particularly in language-related criteria, though they exhibited a tendency to assign higher scores overall. In contrast, open-source models displayed weaker correlations and lower variance, limiting their effectiveness for nuanced assessment tasks. The study highlights the potential of LLMs as complementary tools for automated essay evaluation while identifying challenges such as variability in human and model evaluations, the need for advanced prompt engineering, and the necessity of incorporating diverse essay formats for improved generalizability. These findings provide insights into optimizing LLM performance in educational contexts.
Download

Paper Nr: 453
Title:

Improved Binary Elk Herd Optimizer with Fitness Balance Distance for Feature Selection Using Gene Expression Data

Authors:

Mohamed Wajdi Ouertani, Raja Oueslati and Ghaith Manita

Abstract: This research paper introduces an enhanced version of the Binary Elk Herd Optimizer (BEHO), integrated with a Fitness Distance Balance (FDB) mechanism called FDB-BEHO, tailored for high-dimensional optimization tasks. This study evaluates the performance of FDB-BEHO across multiple gene expression datasets, focusing on feature selection in bioinformatics—a domain characterized by complex, high-dimensional data. The FDB mechanism is designed to prevent premature convergence by maintaining an optimal balance between exploration and exploitation, utilizing a diversity measure that adjusts dynamically based on the fitness-distance correlation among solutions. Comparative analyses demonstrate that FDB-BEHO surpasses traditional meta-heuristic algorithms in fitness values and classification accuracy and reduces the number of selected features, thereby enhancing model simplicity and interpretability. These results validate the effectiveness of FDB-BEHO in navigating complex solution spaces efficiently and underscore its potential applicability in other domains requiring robust feature selection capabilities. The study’s findings suggest that incorporating diversity-enhancing mechanisms like FDB can significantly improve the performance of binary optimization algorithms, offering promising directions for future research in optimization technology.
Download

Paper Nr: 466
Title:

Large Language Models for Summarizing Czech Historical Documents and Beyond

Authors:

Václav Tran, Jakub Šmíd, Jiří Martínek, Ladislav Lenc and Pavel Král

Abstract: Text summarization is the task of shortening a larger body of text into a concise version while retaining its essential meaning and key information. While summarization has been significantly explored in English and other high-resource languages, Czech text summarization, particularly for historical documents, remains underexplored due to linguistic complexities and a scarcity of annotated datasets. Large language models such as Mistral and mT5 have demonstrated excellent results on many natural language processing tasks and languages. Therefore, we employ these models for Czech summarization, resulting in two key contributions: (1) achieving new state-of-the-art results on the modern Czech summarization dataset SumeCzech using these advanced models, and (2) introducing a novel dataset called Posel od Čerchova for summarization of historical Czech documents with baseline results. Together, these contributions provide a great potential for advancing Czech text summarization and open new avenues for research in Czech historical text processing.

Paper Nr: 467
Title:

Agentic AI for Behavior-Driven Development Testing Using Large Language Models

Authors:

Ciprian Paduraru, Miruna Zavelca and Alin Stefanescu

Abstract: Behavior-driven development (BDD) testing significantly improves communication and collaboration between developers, testers and business stakeholders, and ensures that software functionality meets business requirements. However, the benefits of BDD are often overshadowed by the complexity of writing test cases, making it difficult for non-technical stakeholders. To address this challenge, we propose BDDTestAIGen, a framework that uses Large Language Models (LLMs), Natural Language Processing (NLP) techniques, human-in-the-loop and Agentic AI methods to automate BDD test creation. This approach aims to reduce manual effort and effectively involve all project stakeholders. By fine-tuning an open-source LLM, we improve domain-specific customization, data privacy and cost efficiency. Our research shows that small models provide a balance between computational efficiency and ease of use. Contributions include the innovative integration of NLP and LLMs into BDD test automation, an adaptable open-source framework, evaluation against industry-relevant scenarios, and a discussion of the limitations, challenges and future directions in this area.

Paper Nr: 468
Title:

A Comparative Study of CNNs and Vision-Language Models for Chart Image Classification

Authors:

Bruno Côme, Maxime Devanne, Jonathan Weber and Germain Forestier

Abstract: Chart image classification is a critical task in automating data extraction and interpretation from visualizations, which are widely used in domains such as business, research, and education. In this paper, we evaluate the performance of Convolutional Neural Networks (CNNs) and Vision-Language Models (VLMs) for this task, given their increasing use in various image classification and comprehension tasks. We constructed a diverse dataset of 25 chart types, each containing 1,000 images, and trained multiple CNN architectures while also assessing the zero-shot generalization capabilities of pre-trained VLMs. Our results demonstrate that CNNs, when trained specifically for chart classification, outperform VLMs, which nonetheless show promising potential without the need for task-specific training. These findings underscore the importance of CNNs in chart classification while highlighting the unexplored potential of VLMs with further fine-tuning, making this task crucial for advancing automated data visualization analysis.

Paper Nr: 472
Title:

Automatic Analysis of App Reviews Using LLMs

Authors:

Sadeep Gunathilaka and Nisansa de Silva

Abstract: Large Language Models (LLMs) have shown promise in various natural language processing tasks, but their effectiveness for app review classification to support software evolution remains unexplored. This study evaluates commercial and open-source LLMs for classifying mobile app reviews into bug reports, feature requests, user experiences, and ratings. We compare the zero-shot performance of GPT-3.5 and Gemini Pro 1.0, finding that GPT-3.5 achieves superior results with an F1 score of 0.849. We then use GPT-3.5 to autonomously annotate a dataset for fine-tuning smaller open-source models. Experiments with Llama 2 and Mistral show that instruction fine-tuning significantly improves performance, with results approaching commercial models. We investigate the trade-off between training data size and the number of epochs, demonstrating that comparable results can be achieved with smaller datasets and increased training iterations. Additionally, we explore the impact of different prompting strategies on model performance. Our work demonstrates the potential of LLMs to enhance app review analysis for software engineering while highlighting areas for further improvement in open-source alternatives.

Paper Nr: 476
Title:

A Federated Approach to Enhance Calibration of Distributed ML-Based Intrusion Detection Systems

Authors:

Jacopo Talpini, Nicolò Civiero, Fabio Sartori and Marco Savi

Abstract: Network intrusion detection systems (IDSs) are a major component of network security, aimed at protecting network-accessible endpoints, such as IoT devices, from malicious activities that compromise confidentiality, integrity, or availability within the network infrastructure. Machine Learning models are becoming a popular choice for developing an IDS, as they can handle large volumes of network traffic and identify increasingly sophisticated patterns. However, traditional ML methods often require a large centralized dataset, thus raising privacy and scalability concerns. Federated Learning (FL) offers a promising solution by enabling collaborative training of an IDS without sharing raw data among clients. However, existing research on FL-based IDSs primarily focuses on improving accuracy and detection rates, while little or no attention is given to properly estimating the model’s uncertainty in making predictions. Such estimation is nevertheless fundamental to increasing the model’s reliability, especially in safety-critical applications, and can be addressed by appropriate model calibration. This paper introduces a federated calibration approach that ensures the efficient distributed training of a calibrator while safeguarding privacy, as no calibration data has to be shared by clients with external entities. Our experimental results confirm that the proposed approach not only preserves the model’s performance but also significantly enhances confidence estimation, making it ideal for adoption by IDSs.

Paper Nr: 480
Title:

A Survey of Advanced Classification and Feature Extraction Techniques Across Various Autism Data Sources

Authors:

Sonia Slimen, Anis Mezghani, Monji Kherallah and Faiza Charfi

Abstract: Autism, often known as autism spectrum disorder (ASD), is characterized by a range of neurodevelopmental difficulties that impact behavior, social relationships, and communication. Early diagnosis is crucial to provide timely interventions and promote the best possible developmental outcomes. Although well-established, traditional methods such as behavioral tests, neuropsychological assessments, and clinical facial feature analysis are often limited by societal stigma, expense, and accessibility. In recent years, artificial intelligence (AI) has emerged as a transformative tool. AI utilizes advanced algorithms to analyze a variety of data modalities, including speech patterns, kinematic data, facial photographs, and magnetic resonance imaging (MRI), in order to diagnose ASD. Each modality offers unique insights: kinematic investigations show anomalies in movement patterns, face image analysis reveals minor phenotypic indicators, speech analysis shows aberrant prosody, and MRI records neurostructural and functional problems. By accurately extracting information from these modalities, deep learning approaches enhance diagnostic efficiency and precision. However, challenges remain, such as the need for diverse datasets to build robust models, potential algorithmic biases, and ethical concerns regarding the use of private biometric data. This paper provides a comprehensive review of feature extraction methods across various data modalities, emphasizing how they might be incorporated into AI frameworks for the detection of ASD. It highlights the potential of multimodal AI systems to revolutionize autism diagnosis and their responsible implementation in clinical practice by analyzing the advantages, limitations, and future directions of these approaches.

Paper Nr: 481
Title:

SAPG: Semantically-Aware Paraphrase Generation with AMR Graphs

Authors:

Afonso Sousa and Henrique Lopes Cardoso

Abstract: Automatically generating paraphrases is crucial for various natural language processing tasks. Current approaches primarily try to control the surface form of generated paraphrases by resorting to syntactic graph structures. However, although paraphrase generation is rooted in semantics, almost no works try to leverage semantic structures as inductive biases for the task of generating paraphrases. We propose SAPG, a semantically-aware paraphrase generation model, which encodes Abstract Meaning Representation (AMR) graphs into a pretrained language model using a graph neural network-based encoder. We demonstrate that SAPG enables the generation of more diverse paraphrases by transforming the input AMR graphs, allowing for control over the output generations’ surface forms rooted in semantics. This approach ensures that the semantic meaning is preserved, offering flexibility in paraphrase generation without sacrificing fluency or coherence. Our extensive evaluation on two widely-used paraphrase generation datasets confirms the effectiveness of this method.

Paper Nr: 488
Title:

Leveraging Attention Mechanisms for Interpretable Human Embryo Image Segmentation

Authors:

Wided Souid Miled and Nozha Chakroun

Abstract: In-vitro Fertilization (IVF) is a widely used assisted reproductive technology where embryos are cultured under controlled laboratory conditions. The selection of a high-quality blastocyst, typically reached five days after fertilization, is crucial to the success of the IVF procedure. Therefore, evaluating embryo quality at this stage is essential to optimize IVF outcomes. Advances in neural network architectures, particularly Convolutional Neural Networks (CNNs), have enhanced decision-making in IVF. However, ensuring both accuracy and interpretability in these models remains a challenge. This paper focuses on improving human blastocyst segmentation by combining channel attention mechanisms with a ResNet50 model within an encoder-decoder architecture. The method accurately identifies key blastocyst components such as inner cell mass (ICM), trophectoderm (TE), and zona pellucida (ZP). Our approach was validated on a publicly available human embryo dataset, achieving Intersection over Union (IoU) scores of 83.09% for ICM, 86.87% for ZP, and 81.1% for TE, outperforming current state-of-the-art methods. These results demonstrate the potential of deep learning to improve both accuracy and interpretability in embryo quality assessment.
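Illustrative sketch (not taken from the paper): the Intersection over Union (IoU) metric reported above is the ratio between the overlap and the union of the predicted and reference regions for a given class. The snippet below computes per-class IoU on flat label sequences; the function name `iou` and the toy labels (with "BG" standing for background) are assumptions made for the example, and real segmentation masks would be 2-D arrays rather than short lists.

```python
def iou(pred, target, label):
    """Intersection over Union of one class over two equal-length label sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # Pixels where both prediction and reference carry the label.
    inter = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    # Pixels where at least one of them carries the label.
    union = sum(1 for p, t in zip(pred, target) if p == label or t == label)
    # IoU is undefined when the class appears in neither sequence.
    return inter / union if union else float("nan")


# Toy "masks" flattened to one dimension (hypothetical data).
pred = ["ICM", "ICM", "TE", "ZP", "ZP", "BG"]
target = ["ICM", "TE", "TE", "ZP", "BG", "BG"]

scores = {label: iou(pred, target, label) for label in ("ICM", "TE", "ZP")}
```

Each class in this toy example overlaps on one pixel out of a union of two, so every entry of `scores` is 0.5.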

Short Papers
Paper Nr: 18
Title:

Facial Empathy Analysis Through Deep Learning and Computer Vision Techniques in Mixed Reality Environments

Authors:

Insaf Setitra, Domitile Lourdeaux and Louenas Bounia

Abstract: This paper introduces a novel approach for facial empathy analysis using deep learning and computer vision techniques within mixed reality environments. The primary objective is to detect and quantify empathic responses based on facial expressions, establishing the link between empathy and facial expressions. We propose the Deep Convolutional Neural Network with the Exponential Linear Unit activation function (ELU-DCNN). We also design an augmented reality platform with two main features: (i) a virtual overlay of a VR headset on the user’s face and (ii) facial emotion recognition for users wearing the VR headset. Our target is to analyse facial expressions in order to assess the empathy of users while they are immersed in specific environments. Our results analyse the feasibility and effectiveness of these models in detecting and quantifying empathy through facial expressions. This work contributes to the growing field of affective computing and highlights the potential of integrating advanced computer vision techniques in mixed reality applications to better understand human emotional responses.

Paper Nr: 20
Title:

Towards More Robust Transcription Factor Binding Site Classifiers Using Out-of-Distribution Data

Authors:

István Megyeri and Gergely Pap

Abstract: The use of deep learning methods for solving tasks in computational biology has increased in recent years. Many challenging problems, such as gene expression prediction, identifying splicing patterns, and DNA-protein binding site classification, are now addressed with novel deep learning architectures, training strategies and techniques. Moreover, interpretability has become a key component of the methods used to solve computational biology tasks, since gaining novel insight by analyzing the learners is a key factor. However, most deep learning models are hard to interpret, and they are prone to learning features which generalize poorly. In this study, we examine the robustness of high performing neural networks using in-distribution (ID) and out-of-distribution (OOD) examples. We demonstrate our findings in two different tasks taken from the domain of DNA-protein binding site classification and show that the overconfident and incorrect predictions are a result of training data that has been built exclusively from ID samples. Adding OOD data to the training process enhances the reliability of the networks and improves the performance on the ID tasks.

Paper Nr: 22
Title:

Improving Quality of Entity Resolution Using a Cascade Approach

Authors:

Khizer Syed, Onais Khan Mohammed, John Talburt, Adeeba Tarannum, Altaf Mohammed, Mudasar Ali Mir and Mahboob Khan Mohammed

Abstract: Entity Resolution (ER) is a critical technique in data management, designed to determine whether two or more data references correspond to the same real-world entity. This process is essential for cleansing datasets and linking information across diverse records. A variant of this technique, Binary Entity Resolution, focuses on the direct comparison of data pairs without incorporating the transitive closure typically found in cluster-based approaches. Unlike cluster-based ER, where indirect linkages imply broader associations among multiple records (e.g., A is linked with B, and B is linked with C, thereby linking A with C indirectly), Binary ER performs pairwise matching, resulting in a straightforward outcome—a series of pairs from two distinct sources. In this paper, we present a novel improvement to the cascade process used in entity resolution. Specifically, our data-centric, descending confidence cascade approach systematically orders linking methods based on their confidence levels in descending order. This method ensures that higher confidence methods, which are more accurate, are applied first, potentially enhancing the accuracy of subsequent, lower-confidence methods. As a result, our approach produces better quality matches than traditional methods that do not utilize a cascading approach, leading to more accurate entity resolution while maintaining high-quality links. This improvement is particularly significant in Binary ER, where the focus is on pairwise matches, and the quality of each link is crucial.
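Illustrative sketch (not the authors' implementation): one way to read the descending confidence cascade is that linking methods are sorted by confidence and applied one after another, with records linked by an earlier, higher-confidence method removed from the pool seen by later methods. That removal step, the record fields, the confidence values and the name `cascade_link` are all assumptions made for this example.

```python
def cascade_link(left, right, matchers):
    """Pairwise (binary) linking of two sources with a descending-confidence cascade.

    matchers is a list of (name, confidence, predicate) tuples. Methods are applied
    from highest to lowest confidence, and each record is linked at most once.
    """
    links = []
    unmatched_left = list(left)
    unmatched_right = list(right)
    ordered = sorted(matchers, key=lambda m: m[1], reverse=True)
    for name, _confidence, predicate in ordered:
        # Iterate over a copy because matched records are removed as we go.
        for a in list(unmatched_left):
            for b in unmatched_right:
                if predicate(a, b):
                    links.append((a["id"], b["id"], name))
                    unmatched_left.remove(a)
                    unmatched_right.remove(b)
                    break  # stop scanning: 'a' is linked and the list was modified
    return links


# Hypothetical records and matching rules.
left = [
    {"id": "L1", "name": "Ann Lee", "zip": "72201"},
    {"id": "L2", "name": "Bob Ray", "zip": "72202"},
]
right = [
    {"id": "R1", "name": "Ann Lee", "zip": "72201"},
    {"id": "R2", "name": "Robert Ray", "zip": "72202"},
]
matchers = [
    ("zip_only", 0.60, lambda a, b: a["zip"] == b["zip"]),
    ("exact_name_zip", 0.95, lambda a, b: a["name"] == b["name"] and a["zip"] == b["zip"]),
]

links = cascade_link(left, right, matchers)
```

Here the exact rule links L1 with R1 first, so the looser ZIP-only rule only has to decide between the remaining records and links L2 with R2.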

Paper Nr: 24
Title:

Impact of Extended Clauses on Local Search Solvers for Max-SAT

Authors:

Federico Heras

Abstract: Previous research has demonstrated that several techniques based on the resolution rule for Max-SAT are effective in improving results and boosting the search, either as a preprocessing step or when embedded into specific Max-SAT solving algorithms, such as branch-and-bound and Stochastic Local Search (abbreviated SLS) algorithms. These techniques typically lead to a simplified and reduced Max-SAT formula, thereby enabling the algorithms to find solutions more efficiently. In this paper, we take a different approach by introducing a preprocessing step that, in contrast to prior methods, increases the size of the Max-SAT formula based on the Extension Rule. Our objective is to examine how this expansion of the problem instance impacts the performance of SLS algorithms. The empirical results indicate that for a subset of SLS algorithms, this approach yields improved solutions. This finding is significant as it challenges the conventional wisdom that smaller, simplified formulas are always better for all kinds of solvers.
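Illustrative sketch (not the paper's preprocessing procedure): the extension rule replaces a clause C by the two clauses (C ∨ x) and (C ∨ ¬x) for a variable x that does not occur in C. For unweighted Max-SAT this keeps the number of falsified clauses unchanged under every assignment, because exactly one of the two new clauses is falsified whenever C is. The snippet checks this on a tiny example; the helper names, the DIMACS-style integer literals and the choice of clause and variable are assumptions, and the abstract does not say how clauses or variables are selected.

```python
from itertools import product


def extend_clause(clause, var):
    """Apply the extension rule: C becomes (C or var) and (C or not var)."""
    if var in clause or -var in clause:
        raise ValueError("the extension variable must not occur in the clause")
    return [clause | {var}, clause | {-var}]


def count_falsified(clauses, assignment):
    """Number of clauses with no satisfied literal under a {variable: bool} assignment."""
    def satisfied(lit):
        return assignment[abs(lit)] == (lit > 0)
    return sum(not any(satisfied(lit) for lit in clause) for clause in clauses)


# Clause (x1 or not x2), extended with the fresh variable x3.
original = [frozenset({1, -2})]
extended = extend_clause(original[0], 3)

# Compare the cost of both formulas under all 8 assignments of x1..x3.
costs_match = all(
    count_falsified(original, dict(zip((1, 2, 3), values)))
    == count_falsified(extended, dict(zip((1, 2, 3), values)))
    for values in product([False, True], repeat=3)
)
```

`costs_match` evaluates to True, which is why such an extension enlarges the formula without changing its optimum.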

Paper Nr: 41
Title:

Intelligent Health & Mission Management Architecture for Autonomous and Resilient Distributed Space Systems

Authors:

Mohammad Reza Jabbarpour, Ghaith El-Dalahmeh, Bao Quoc Vo and Ryszard Kowalczyk

Abstract: Distributed Space Systems (DSS) play a vital role in the success of multi-spacecraft missions, which are garnering considerable attention because of their affordability through lower costs of multiple smaller spacecraft, adaptability through reconfiguration, and resilience to failure through redundancy. These systems enable collaborative endeavours among spacecraft, thus amplifying exploration capabilities within such missions. Nevertheless, the presence of multiple satellites amplifies the system’s complexity and raises the probability of fault occurrences. Consequently, an efficient health and mission management (HMM) system capable of accurately detecting and identifying faults within such a complex system is imperative to enhance mission success. In this study, we introduce an innovative Intelligent Agent-based HMM (IHMM) architecture for multi-spacecraft systems, leveraging Intelligent Agents (IAs) to seamlessly integrate mission success with satellite health and resilience. A thorough exploration and classification of diverse data sources suitable for integration into IAs is conducted, categorised according to their deployment type and intended roles. To evaluate and validate our proposed architecture, we conducted a preliminary analysis using one-time and continuous friction faults on a reaction wheel. The experiments show our approach outperforms traditional methods by proactively adapting control strategies in real-time and preventing saturation of other reaction wheels.

Paper Nr: 44
Title:

Explaining Mammographic Texture: The Role of View and Abnormality Type in Early Cancer Diagnosis

Authors:

Bianca Iacob and Laura Diosan

Abstract: Detecting breast cancer at an early stage significantly increases the chances of successful treatment and survival. Understanding the full topology of various abnormalities requires analyzing multiple mammography views. This study evaluates the performance of mammographic views in detecting abnormalities, focusing on calcifications and masses, to enhance early cancer diagnosis. By examining the importance of considering both the type of abnormality and the mammographic view, we aim to identify key factors influencing detection accuracy. Additionally, we investigate whether incorporating textural features such as GLCM, GLRLM, and GLSZM can improve overall model performance. Our findings underscore the necessity of a tailored approach in mammographic analysis. These insights are crucial for advancing early diagnostic capabilities and improving patient outcomes.

Paper Nr: 45
Title:

Diff-SySC: An Approach Using Diffusion Models for Semi-Supervised Image Classification

Authors:

Paul-Dumitru Orășan, Alexandra-Ioana Albu and Gabriela Czibula

Abstract: Diffusion models have revolutionized the field of generative machine learning due to their effectiveness in capturing complex, multimodal data distributions. Semi-supervised learning is a technique that allows the extraction of information from a large corpus of unlabeled data, assuming that a small subset of labeled data is provided. While many generative methods have been previously used in semi-supervised learning tasks, only a few approaches have integrated diffusion models in such a context. In this work, we adapt state-of-the-art generative diffusion models to the problem of semi-supervised image classification. We propose Diff-SySC, a new semi-supervised, pseudo-labeling pipeline which uses a diffusion model to learn the conditional probability distribution characterizing the label generation process. Experimental evaluations highlight the robustness of Diff-SySC when evaluated on image classification benchmarks and show that it outperforms related work approaches on CIFAR-10 and STL-10, while achieving competitive performance on CIFAR-100. Overall, our proposed method outperforms the related work in 90.74% of the cases.

Paper Nr: 47
Title:

On the Prediction of a Nonstationary Geometric Distribution Based on Bayes Decision Theory

Authors:

Daiki Koizumi

Abstract: This paper considers a prediction problem with a nonstationary geometric distribution in terms of Bayes decision theory. The proposed nonstationary statistical model contains a single hyperparameter, which is used to express the nonstationarity of the parameter of the geometric distribution. Furthermore, the proposed predictive algorithm is based on both the posterior distribution of the nonstationary parameter and the predictive distribution for data, operating within a Bayesian context. Each predictive estimator satisfies the Bayes optimality, which guarantees a minimum mean error rate with the proposed nonstationary probability model, a loss function, and a prior distribution of the parameter in terms of Bayes decision theory. Furthermore, an approximate maximum likelihood estimation method for the hyperparameter based on numerical calculation has been considered. Finally, the predictive performance of the proposed algorithm has been evaluated in terms of both the model selection theory and the predictive mean squared error by comparison with the stationary geometric distribution using real web traffic data.

Paper Nr: 48
Title:

Facility Layout Generation Using Hierarchical Reinforcement Learning

Authors:

Shunsuke Furuta, Hiroyuki Nakagawa and Tatsuhiro Tsuchiya

Abstract: Facility Layout Problem (FLP), which is an optimization problem aimed at determining the optimal placement of facilities within a specified site, faces limitations in existing methods that use genetic algorithms (GA) and metaheuristic approaches. These methods require accurately specifying constraints for facility placement, making them difficult to utilize effectively in environments with few skilled workers. In layout generation using reinforcement learning-based methods, the need to consider multiple requirements results in an expanded search space, which poses a challenge. In this study, we implemented a system that adopts hierarchical reinforcement learning and evaluated its performance by applying it to existing benchmark problems. As a result, we were able to confirm that the system could stably generate facility layouts that meet the given conditions while addressing the issues found in previous methods.

Paper Nr: 50
Title:

Optimizing Elevator Performance with SARL Multi-Agent Systems: A Distributed Approach for Enhanced Responsiveness and Efficiency

Authors:

Vy Le, Oliver Harold Joegensen, Tin Nguyen, Khang Nguyen Hoang and Ginel Dorleon

Abstract: Elevators play a pivotal role in modern urban living, boosting productivity and convenience efficiently. In elevator systems, the optimization of Multi-Agent Systems (MAS) is indispensable as it enhances agent coordination, adaptability, delay reduction, client satisfaction, and resource use. In this paper, we introduce an algorithm based on SARL MAS designed to enhance elevator controller performance. Our approach compares Centralized and Distributed Agent Systems, demonstrating the superiority of Distributed Agent Systems due to their improved responsiveness, efficiency, and adaptability. Our findings provide valuable insight into the use of SARL MAS not only for elevator control but also for other applications such as queue management systems and resource allocation in computing, highlighting the benefits of a distributed approach.

Paper Nr: 56
Title:

Beyond Equality Matching: Custom Loss Functions for Semantics-Aware ICD-10 Coding

Authors:

Monah Bou Hatoum, Jean Claude Charr, Alia Ghaddar, Christophe Guyeux and David Laiymani

Abstract: Background: Accurate ICD-10 coding is vital for healthcare operations, yet manual processes are inefficient and error-prone. Machine learning offers automation potential but struggles with complex relationships between codes and clinical text. Objective: We propose a semantics-aware approach using custom loss functions to improve accuracy and clinical relevance in multi-label ICD-10 coding by leveraging cosine similarity to measure semantic relatedness between predicted and actual codes. Methods: Four custom loss functions (True Label Cardinality Loss (TLCL), Predicted Label Cardinality Loss (PLCL), Balanced Harmonic Mean Loss (BHML), and Weighted Harmonic Mean Loss (WHML)) were designed to capture hierarchical and semantic relationships. These were validated on a dataset of 9.57 million clinical notes from 24 medical specialties, using binary cross-entropy (BCE) loss as a baseline. Results: Our approach achieved a test micro-F1 score of 88.54%, surpassing the 74.64% baseline, with faster convergence and improved performance across specialties. Conclusion: Incorporating semantic similarity into the loss functions enhances ICD-10 code prediction, addressing clinical nuances and advancing machine learning in medical coding.
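Illustrative sketch (not one of the paper's four loss functions): the idea of scoring a prediction by semantic relatedness rather than exact equality can be shown with cosine similarity between code embeddings. Below, a predicted code is penalised by one minus its best cosine similarity to any true code, so a near-miss costs less than an unrelated code. The embedding vectors, the function names and the penalty formula are assumptions made for the example; the definitions of TLCL, PLCL, BHML and WHML are not given in the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def semantic_penalty(pred_vecs, true_vecs):
    """Mean over predicted codes of (1 - best cosine similarity to any true code)."""
    return sum(
        1.0 - max(cosine(p, t) for t in true_vecs) for p in pred_vecs
    ) / len(pred_vecs)


# Hypothetical 2-D embeddings of ICD-10 codes (real embeddings would be learned).
true_codes = [[1.0, 0.0]]
near_miss = [[0.9, 0.1]]   # semantically close to the true code
unrelated = [[0.0, 1.0]]   # orthogonal to the true code

near_penalty = semantic_penalty(near_miss, true_codes)
far_penalty = semantic_penalty(unrelated, true_codes)
```

With these toy vectors the near-miss receives a much smaller penalty than the unrelated code, whereas plain equality matching would treat both as equally wrong.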

Paper Nr: 57
Title:

ML-Based Virtual Sensing for Groundwater Monitoring in the Netherlands

Authors:

Laure Grisez, Shreshtha Sharma and Paolo Pileggi

Abstract: The increasing need for effective groundwater monitoring presents a valuable opportunity for Machine Learning (ML)-based virtual sensing, especially in regions with challenging sensor networks. This paper studies the practical application of two core ML models, Gaussian Process Regression (GPR) and Position Embedding Graph Convolutional Network (PEGCN), for predicting groundwater levels in The Netherlands. Additionally, other models, such as Graph Convolutional Networks and Graph Attention Networks, are mentioned for completeness, offering a broader understanding of ML methods in this domain. Through two experiments, sensor data reconstruction and virtual sensor prediction, we consider model performance, ease of implementation, and computational requirements. Practical lessons are drawn, emphasising that while advanced models like PEGCN excel in accuracy for complex environments, simpler models like GPR are better suited for non-experts due to their ease of use and minimal computational overhead. These insights highlight the trade-offs between accuracy and usability, with important considerations for real-world deployment by practitioners less familiar with ML.

Paper Nr: 76
Title:

The Evolution of Criticality in Deep Reinforcement Learning

Authors:

Chidvilas Karpenahalli Ramakrishna, Adithya Mohan, Zahra Zeinaly and Lenz Belzner

Abstract: In Reinforcement Learning (RL), certain states demand special attention due to their significant influence on outcomes; these are identified as critical states. The concept of criticality is essential for the development of effective and robust policies and to improve overall trust in RL agents in real-world applications like autonomous driving. The current paper takes a deep dive into criticality and studies the evolution of criticality throughout training. The experiments are conducted on a new, simple yet intuitive continuous cliff maze environment and the Highway-env autonomous driving environment. Here, a novel finding is reported: criticality is not only learnt by the agent but can also be unlearned. We hypothesize that diversity in experiences is necessary for effective criticality quantification, which is largely driven by the chosen exploration strategy. This close relationship between exploration and criticality is studied using two different strategies, namely the exponential ε-decay and the adaptive ε-decay. The study supports the idea that effective exploration plays a crucial role in accurately identifying and understanding critical states.
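Illustrative sketch (not the paper's configuration): in ε-greedy exploration, an exponential ε-decay schedule multiplies the exploration rate by a constant factor at every step until it reaches a floor. The function name and the values of `eps_start`, `eps_min` and `decay` are assumptions chosen for the example; the adaptive ε-decay variant is not sketched because the abstract does not define it.

```python
def exponential_epsilon(step, eps_start=1.0, eps_min=0.05, decay=0.99):
    """Exploration rate after `step` decay steps, clipped from below at eps_min."""
    return max(eps_min, eps_start * decay ** step)


# Exploration rate at a few points during training (hypothetical schedule).
schedule = [exponential_epsilon(s) for s in (0, 100, 1000)]
```

The rate starts at `eps_start`, shrinks geometrically, and stays at `eps_min` once the decayed value falls below it, so some exploration always remains.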

Paper Nr: 83
Title:

Improving Machine Learning Performance in Credit Scoring by Data Analysis and Data Pre-Processing

Authors:

Bogdan Ichim and Bilal Issa

Abstract: In this paper we showcase several data analysis and data pre-processing techniques which, when applied to the dataset Give Me Some Credit, lead to improvements in the performance of several machine learning algorithms in classifying defaulters and non-defaulters in comparison with other existing solutions from the literature. Our study underscores the importance of these techniques in data science in general, and in enhancing the machine learning outcomes in particular.

Paper Nr: 86
Title:

Enhancing Personalized Decision-Making with the Balanced SPOTIS Algorithm

Authors:

Andrii Shekhovtsov, Jean Dezert and Wojciech Sałabun

Abstract: Although very useful in solving decision-making problems, classical Multi-Criteria Decision-Making (MCDM) techniques were designed to consider only profit and cost criteria. However, in some cases, it can be necessary to include more complex preferences of decision makers to better fit the problem. In such cases, modern MCDM methods such as Stable Preference Ordering Towards Ideal Solution (SPOTIS) can be used. The SPOTIS method allows for providing the Expected Solution Point (ESP) as input data for the decision problem. However, this approach can lead to unsatisfactory results if the provided expert preferences are unreliable. To solve this problem, we propose a novel Balanced SPOTIS method with an ESP confidence parameter, which allows us to obtain a solution that is balanced between objectively ideal solutions and subjective expert preferences. We show how this new approach works in a case study of selecting a used car and provide an in-depth analysis of the problem using the new ESP confidence parameter for sensitivity analysis. Finally, to underline the advantages of the proposed approach, we compare it with the Expected Solution Point - Characteristic Objects Method (ESP-COMET).

Paper Nr: 87
Title:

Towards Enhanced Decision Making: Integrating Weighted Expected Solution Points in Multi-Criteria Analysis

Authors:

Andrii Shekhovtsov, Bartłomiej Kizielewicz and Wojciech Sałabun

Abstract: Multi-Criteria Decision Analysis (MCDA) addresses complex problems across various domains by considering multiple decision criteria. This interdisciplinary field offers a systematic approach to decision-making, accommodating contradictory criteria and non-linear factors. Reference points are crucial in MCDA, facilitating a nuanced understanding of decision interrelationships and outcomes. While classic MCDA methods rely on static reference points, recent advances introduce manual allocation mechanisms, such as the Stable Preference Ordering Toward Ideal Solution (SPOTIS) and Characteristic Objects Method (COMET). However, incorporating reference points alone may overlook the significance of individual criteria, leading to the paradox of equal evaluations. To address this issue, an extension of the COMET method, Expected Solution Point (ESP-COMET), introduces weighted considerations to accurately reflect experts’ preferences. This paper proposes a methodology to integrate weights into ESP-COMET, enhancing its efficacy in decision modeling. We applied the proposed approach in a case study focused on the evaluation of hydrogen-fueled vehicles. By identifying the decision model and considering both the expected solution point and the relevance of the criteria to it, we demonstrated the utility of weighted ESP in improving decision-making processes.

Paper Nr: 88
Title:

Comparison of Monolithic and Structural Decision Models Using the Hamming Distance

Authors:

Andrii Shekhovtsov, Amirkia Rafiei and Wojciech Sałabun

Abstract: This study shows a simple yet effective approach to comparing decision models built using the Characteristic Objects Method (COMET). The proposed approach is based on the Hamming Distance and its adaptation for complex decision problems that involve structural division of the model. We present a simulation-based proof-of-concept and then apply the proposed approach to a case study of evaluating ten hydrogen cars based on the information provided by the manufacturers. We compared six decision models created based on the preferences of three decision makers expressed using the Expected Solution Point (ESP) and Triad Support algorithms. The results obtained provide, on the one hand, some useful insights into customers’ preferences and expectations for hydrogen cars and, on the other hand, show the utility of the proposed comparison methodology.
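Illustrative sketch (not the paper's adaptation): the Hamming distance between two equal-length sequences is the number of positions at which they differ. Applied to decision models, one simple reading is to compare the outcomes two models assign to the same items, position by position. What exactly is compared in the paper (and how the structural variant works) is not stated in the abstract, so the rank vectors below are a hypothetical stand-in.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical rank positions assigned to the same five alternatives by two models.
ranks_model_a = [1, 2, 3, 4, 5]
ranks_model_b = [1, 3, 2, 4, 5]

distance = hamming_distance(ranks_model_a, ranks_model_b)
```

The two toy models disagree on the second and third alternatives only, so `distance` is 2; identical models would give 0.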
Download
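
The Hamming-distance comparison described in Paper Nr 88 can be illustrated with a minimal sketch. It assumes each decision model is summarised as an equal-length sequence of discrete labels (for example, a ranking of alternatives). The function name and the example rankings are hypothetical and are not taken from the paper, whose adaptation for structural models is not reproduced here.

```python
from typing import Hashable, Sequence


def hamming_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical rankings of five alternatives produced by two decision models.
ranking_model_a = [1, 2, 3, 4, 5]
ranking_model_b = [1, 3, 2, 4, 5]

distance = hamming_distance(ranking_model_a, ranking_model_b)
# Dividing by the sequence length gives a value in [0, 1].
normalised = distance / len(ranking_model_a)
```

A normalised value of 0 means the two sequences agree everywhere, and 1 means they disagree at every position.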

Paper Nr: 91
Title:

Evaluating ResNet-Based Self-Explanatory Models for Breast Lesion Classification

Authors:

Adél Bajcsi, Camelia Chira and Annamária Szenkovits

Abstract: Breast cancer is one of the leading causes of mortality among women diagnosed with cancer. In recent years, numerous computer-aided diagnosis (CAD) systems have been proposed for the classification of breast lesions. This study investigates self-explanatory deep learning models, namely BagNet and ProtoPNet, for the classification of breast abnormalities. Our aim is to train models to distinguish between benign and malignant lesions in breast tissue using publicly available datasets, namely MIAS and DDSM. The study provides a comprehensive numerical comparison of the two self-explanatory models and their respective backbones, as well as a visual evaluation of model performance. The results indicate that, while the backbone (black-box model) exhibits slightly better performance, it does so at the expense of interpretability. Conversely, BagNet, despite being a simpler model, achieves results comparable to those of ProtoPNet. In addition, transfer learning and data augmentation techniques are employed to enhance the performance of the CAD system.
Download

Paper Nr: 94
Title:

A Vector Autoregression Model for Depicting the Relation Between Labour Market Economic Indicators and Real Wages in the United States Manufacturing Sector

Authors:

Ishaan Kshirsagar, Julian Márquez Simon, Nicolò Schätz, David Fraga Gonzalez and Conor Ryan

Abstract: In recent years, the US manufacturing sector and its labour market dynamics have gained importance in the face of resurgent protectionism and increased governmental strategic investment plans. Simultaneously, real wage growth in the manufacturing sector has diverged compared to the wider economy. While studies have previously analysed the relationship between labour market conditions and real wages in the wider economy, few have specifically evaluated the manufacturing sector in this respect. To this end, we selected a comprehensive list of economic indicators covering the key aspects of the sectoral labour market. Subsequently, a vector autoregression (VAR) model was developed, enabling us to account for time lags and the interconnectedness of each variable. In addition to this, graphs and plots were created to provide a visual understanding of the database, results, and labour market dynamics. The findings of our model suggest that the economic consensus on real wage determination in the wider economy also holds for the manufacturing sector. An important exception to this is the strongly negative relationship between the inflation rate and real wages.
Download

Paper Nr: 96
Title:

QuakeWake: A Novel AI-Based Early Earthquake Warning and Post-Quake Building Safety Guidance System

Authors:

Dhroov V. Bharatia

Abstract: Millions of people around the world suffer from earthquakes every year. This research introduces an innovative, mobile device-based approach for real-time earthquake detection and prediction. By distinguishing quake patterns from users’ regular usage patterns, the approach avoids excessive battery drain by running an on-device neural network only when it is needed to detect earthquake tremors. Cloud servers running an AI module reliably predict the quake intensity and propagation pattern using signals from many users, enabling warnings to others who have yet to experience these tremors. The system also detects buildings that are at high risk to reinhabit because relative floor displacement exceeds the building safety standards. A low-cost, highly reliable optional adjunct device on the user’s premises captures tremors with higher accuracy than mobile devices. This enables effective building-wide earthquake warnings and eliminates fatalities due to post-earthquake building structural integrity issues. With a neural network trained on many past earthquake patterns, the mobile devices reliably detected quakes and the AI module detected their propagation with 99% accuracy, warning users along the path. Moreover, the adjunct device adequately captured shifts in the building’s structure and reliably flagged the building as uninhabitable with more than 95% accuracy.
Download

Paper Nr: 100
Title:

Deep Learning for Frailty Classification Using Raw Inertial Sensor Gait Data

Authors:

Arslan Amjad, Agnieszka Szczęsna and Monika Błaszczyszyn

Abstract: Frailty is a significant health issue in older adults that increases the risk of disability, decline in physiologic reserve and function, hospitalization, and even death. The social and economic impact of frailty has increased due to higher healthcare costs and greater use of medical resources. Early frailty detection and intervention can prevent its progression and delay disability, ultimately improving the quality of life in the elderly population. This study proposes a frailty classification system based on gait data collected from an Inertial Measurement Unit (IMU) sensor using a Deep Learning (DL) approach. The individual’s frailty status is classified as robust, pre-frail, or frail. A publicly available dataset of 163 participants was used to analyze the raw gait signals and find the most effective DL model for extracting gait patterns for frailty classification. The DeepConvLSTM model performed effectively on raw IMU gait data, with a balanced accuracy, precision, recall, and F1-score of 91%. The results show that the proposed methodology successfully classifies pre-frail individuals, which demonstrates its potential to enhance frailty detection and intervention in clinical settings. This could ultimately improve healthcare and quality of life in elderly populations.
Download

Paper Nr: 103
Title:

Improving Temporal Knowledge Graph Completion via Tensor Decomposition with Relation-Time Context and Multi-Time Perspective

Authors:

Nam Le, Thanh Le and Bac Le

Abstract: Knowledge graphs have progressively incorporated temporal dimensions to effectively mirror the dynamism of real-world data, proving instrumental in applications ranging from question answering to event prediction. While the ubiquity of data incompleteness and the well-established challenges of traditional knowledge graph embedding techniques remain acknowledged, this paper advances the frontier of this research area. We introduce Multi-Time Perspective Relation-Time Context ComplEx Embedding (MPComplEx), a tensor decomposition-based temporal knowledge graph completion model that not only assimilates temporal and relational interactions specific to timestamps but also integrates advanced time perspective features from the recent TPComplEx models. Our experimental evaluations show substantial improvements over conventional models, achieving state-of-the-art performance on benchmark datasets with notable increments: 4.30%/4.79% on ICEWS-14, 11.70%/11.48% on ICEWS-05-15, 21.50%/31.20% on YAGO15k, and 26.90%/66.09% on GDELT in terms of absolute/relative performance gains on mean reciprocal rank (MRR).
Download
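
For readers unfamiliar with the metric reported in Paper Nr 103, the sketch below computes mean reciprocal rank under its standard definition, together with absolute and relative gains under one common convention. The abstract does not state the exact definitions the authors used, so the gain formulas and all numbers in the example are assumptions for illustration only.

```python
from typing import Iterable


def mean_reciprocal_rank(ranks: Iterable[int]) -> float:
    """Average of 1/rank, where rank is the 1-based position of the correct answer."""
    rank_list = list(ranks)
    if not rank_list:
        raise ValueError("at least one rank is required")
    if any(r < 1 for r in rank_list):
        raise ValueError("ranks are 1-based and must be >= 1")
    return sum(1.0 / r for r in rank_list) / len(rank_list)


def absolute_gain(new: float, baseline: float) -> float:
    """Difference between the new score and the baseline (assumed convention)."""
    return new - baseline


def relative_gain(new: float, baseline: float) -> float:
    """Absolute gain expressed as a fraction of the baseline (assumed convention)."""
    return (new - baseline) / baseline


# Hypothetical ranks of the correct entity for three test queries.
mrr = mean_reciprocal_rank([1, 2, 4])
```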

Paper Nr: 109
Title:

Double Descent Phenomenon in Liquid Time-Constant Networks, Quantized Neural Networks and Spiking Neural Networks

Authors:

Hongqiao Wang and James Pope

Abstract: Recent theoretical machine learning research has shown that the traditional U-shaped bias-variance trade-off hypothesis is not correct for certain deep learning models. Complex models with more parameters will fit the training data well, often with zero training loss, but generalise poorly, a situation known as overfitting. However, some deep learning models have been shown to generalise even after overfitting, a situation known as the double descent phenomenon. Understanding which deep learning models exhibit this phenomenon is important so that practitioners can design and train these models effectively. It is not known whether more recent deep learning models exhibit this phenomenon. In this study, we investigate double descent in three recent neural network architectures: Liquid Time-Constant Networks (LTCs), Quantised Neural Networks (QNNs), and Spiking Neural Networks (SNNs). We conducted experiments on the MNIST, Fashion MNIST, and CIFAR-10 datasets by varying the widths of the hidden layers while keeping other factors constant. Our results show that LTC models exhibit a subtle form of double descent, while QNN models demonstrate a pronounced double descent on CIFAR-10. However, the SNN models did not show a clear pattern. Interestingly, we found that the learning rate scheduler, label noise, and training epochs can significantly affect the double descent phenomenon.
Download

Paper Nr: 110
Title:

Tunisian Dialect Speech Corpus: Construction and Emotion Annotation

Authors:

Latifa Iben Nasr, Abir Masmoudi and Lamia Hadrich Belguith

Abstract: Speech Emotion Recognition (SER) using Natural Language Processing (NLP) for underrepresented dialects faces significant challenges due to the lack of annotated corpora. This research addresses this issue by constructing and annotating SERTUS (Speech Emotion Recognition in TUnisian Spontaneous speech), a novel corpus of spontaneous speech in the Tunisian Dialect (TD), collected from various domains such as sports, politics, and culture. SERTUS includes both registers of TD: the popular (familiar) register and the intellectual register, capturing a diverse range of emotions in spontaneous settings and natural interactions across different regions of Tunisia. Our methodology uses a categorical approach to emotion annotation and employs inter-annotator agreement measures to ensure the reliability and consistency of the annotations. The results demonstrate a high level of agreement among annotators, indicating the robustness of the annotation process. The study’s core contribution lies in its comprehensive and rigorous approach to the development of a dataset of spontaneous emotional speech in this dialect. The constructed corpus has significant potential applications in various fields, such as human-computer interaction, mental health monitoring, call center analytics, and social robotics. It also facilitates the development of more accurate and culturally nuanced SER systems. This work contributes to existing research by providing a high-quality annotated corpus while emphasizing the importance of including underrepresented dialects in NLP research.
Download

Paper Nr: 116
Title:

Fuzzy Logic for Cybersecurity: Intrusion Detection and Privacy Preservation with Synthetic Data

Authors:

Marina Soledad Iantorno and Khalil Beladda

Abstract: This research explores the use of fuzzy logic in intrusion detection systems (IDS), aiming to improve cybersecurity threat detection. Conventional machine learning models, like Decision Trees and Support Vector Machines, are evaluated against a fuzzy logic model that employs triangle and parallelogram-shaped membership functions to address the uncertainty in network traffic. The fuzzy logic system performed well, achieving greater accuracy, precision, and F1-scores than conventional models, particularly when using real network traffic data. Synthetic data produced by Wasserstein Generative Adversarial Networks (WGANs) was also used to evaluate the model's robustness and guarantee privacy protection in future studies. The relevance of this approach lies in its ability to provide more comprehensive threat detection, helping organizations safeguard their systems in environments where strict, rule-based models may fall short. The findings indicate that the fuzzy logic methodology is effective, even when applied to synthetic data, demonstrating that it is a feasible choice for intrusion detection in sensitive contexts. Subsequent research will investigate the incorporation of deep learning methodologies and the modification of the model for distributed systems, focusing on scalability and real-time threat identification.
Download
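
The triangular membership functions mentioned in Paper Nr 116 can be illustrated with the textbook definition below. This is a generic sketch, not the authors' implementation. The parameter names a, b, c and the "medium packet rate" example are hypothetical.

```python
def triangular_membership(x: float, a: float, b: float, c: float) -> float:
    """Degree of membership of x in a triangular fuzzy set.

    The set rises linearly from 0 at a to 1 at the peak b,
    then falls linearly back to 0 at c. Requires a < b < c.
    """
    if not (a < b < c):
        raise ValueError("parameters must satisfy a < b < c")
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy set "medium packet rate" spanning 0..10 with its peak at 5.
degree = triangular_membership(2.5, 0.0, 5.0, 10.0)
```

A value between 0 and 1 expresses partial membership, which is how a fuzzy rule base can represent traffic that is neither clearly normal nor clearly anomalous.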

Paper Nr: 124
Title:

Non-Invasive People Counting in Smart Buildings: Employing Machine Learning with Binary PIR Sensors

Authors:

Azad Shokrollahi, Fredrik Karlsson, Reza Malekian, Jan A. Persson and Arezoo Sarkheyli-Hägele

Abstract: People counting in smart buildings is crucial for the efficient management of building systems such as energy, space allocation, efficiency, and occupant comfort. This study investigates the use of two non-invasive binary Passive Infrared (PIR) sensors for estimating the number of people in seven office rooms with different people counting intervals. Previous studies often relied on sensor fusion or more complex signal-based PIR sensors, which increased hardware costs, raised privacy concerns, and added installation complexity. Our approach addresses these limitations by utilizing fewer sensors, reducing hardware costs, and simplifying installation, making it scalable and flexible for different room configurations while also preserving privacy. Additionally, binary PIR sensors are typically already part of smart building systems, eliminating the need for additional sensors. We employed several machine learning methods to analyze motion detected by binary PIR sensors, improving the accuracy of people counting estimates. To estimate the number of people, we extracted event count, duration, and density features from the sensor data, along with features describing the room’s shape. Models like Gradient Boosting, XGBoost, MLP, and LGBM demonstrated superior performance owing to their strong ability to handle complex, non-linear relationships in sensor data, high-dimensional datasets, and imbalanced data, which are common challenges in people counting tasks using PIR sensors. These models were evaluated using performance metrics such as accuracy and F1-score. Additionally, the results show that features such as passage events and the number of detected events, combined with machine learning algorithms, can achieve good accuracy and reliability in people counting.
Download

Paper Nr: 137
Title:

Proof of Learning Applied to Binary Neural Networks

Authors:

Zoltán-Valentin Gyulai-Nagy

Abstract: This paper introduces a novel method that leverages binary neural networks (BNNs) for transaction validation on blockchains. Utilizing the computational capabilities of traditional Proof-of-Work systems, this approach generates multiple models suitable for real-world applications. BNNs are chosen for their smaller memory footprint, fitting well into blockchain validations and embedding within blocks. The method aligns with the Proof of Learning concept, requiring neural network training to create new blocks, while also incorporating computationally intensive heuristic approaches. Despite the lower precision of BNNs compared to traditional models, their reduced computational demand during inference is beneficial. The goal is to improve their precision through multiple training rounds and the use of evolutionary algorithms. This scalable approach can be customized to meet diverse application needs by allowing users to upload datasets for training specific models. Additionally, it is cost-effective as BNNs can be trained on low-cost devices, broadening access. This strategy aims to refine blockchain validation processes and produce usable models as a byproduct.
Download

Paper Nr: 138
Title:

Machine Learning Based Collaborative Filtering Using Jensen-Shannon Divergence for Context-Driven Recommendations

Authors:

Jihene Latrech, Zahra Kodia and Nadia Ben Azzouna

Abstract: This research presents a machine learning-based context-driven collaborative filtering approach with three steps: contextual clustering, weighted similarity assessment, and collaborative filtering. User data is clustered across 3 aspects, and similarity scores are calculated, dynamically weighted, and aggregated into a normalized User-User similarity matrix. Collaborative filtering is then applied to generate contextual recommendations. Experiments on the LDOS-CoMoDa dataset demonstrated good performance, with RMSE and MAE rates of 0.5774 and 0.3333 respectively, outperforming reference approaches.
Download
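
The title of Paper Nr 138 names the Jensen-Shannon divergence as its similarity ingredient. The sketch below shows the standard definition for two discrete probability distributions, using base-2 logarithms so that the result lies in [0, 1]. How the authors weight and aggregate such scores is not described in the abstract, so the `1 - JSD` similarity at the end is an assumption for illustration only.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Symmetric divergence: average KL of p and q to their midpoint m."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical rating distributions of two users over three categories.
user_a = [0.5, 0.3, 0.2]
user_b = [0.4, 0.4, 0.2]
similarity = 1.0 - jensen_shannon_divergence(user_a, user_b)  # assumed conversion
```

Because the midpoint m is positive wherever p or q is positive, the divergence stays finite even when one distribution assigns zero probability to a category.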

Paper Nr: 141
Title:

Brain-Driven Robotic Arm: Prototype Design and Initial Experiments

Authors:

Fatma Abdelhedi, Lama Aljedaani, Amal Abdallah Batheeb and Renad Abdullah Aldahasi

Abstract: Advances in robotic control have revolutionized assistive technologies for individuals with upper limb amputations. Daily tasks, which are often complex or time-consuming, can be challenging without assistance. Traditional assistive devices often demand significant physical effort and lack versatility, limiting user independence. In response, the Brain-Driven Robotic Arm project aims to develop an advanced assistive device that allows individuals with disabilities to control a robotic arm using their brain signals. Utilizing brain-computer interface (BCI) technology with electroencephalogram (EEG) signals, the system processes brain activity to generate commands for the robotic arm, offering a more intuitive and efficient assistive solution. The experimental setup integrates the 6-DOF Yahboom DOFBOT Robotic Arm Kit with the 14-Channel EPOC X EEG Headset, where the system control is managed via Python software, using the Latent Dirichlet Allocation (LDA) algorithm for AI-driven tasks.
Download

Paper Nr: 143
Title:

CuRLA: Curriculum Learning Based Deep Reinforcement Learning for Autonomous Driving

Authors:

Bhargava Uppuluri, Anjel Patel, Neil Mehta, Sridhar Kamath and Pratyush Chakraborty

Abstract: In autonomous driving, traditional Computer Vision (CV) agents often struggle in unfamiliar situations due to biases in the training data. Deep Reinforcement Learning (DRL) agents address this by learning from experience and maximizing rewards, which helps them adapt to dynamic environments. However, ensuring their generalization remains challenging, especially with static training environments. Additionally, DRL models lack transparency, making it difficult to guarantee safety in all scenarios, particularly those not seen during training. To tackle these issues, we propose a method that combines DRL with Curriculum Learning for autonomous driving. Our approach uses a Proximal Policy Optimization (PPO) agent and a Variational Autoencoder (VAE) to learn safe driving in the CARLA simulator. The agent is trained using two-fold curriculum learning, progressively increasing environment difficulty and incorporating a collision penalty in the reward function to promote safety. This method improves the agent’s adaptability and reliability in complex environments, and helps in understanding the nuances of balancing multiple reward components from different feedback signals in a single scalar reward function.
Download

Paper Nr: 148
Title:

Kernel-Level Malware Analysis and Behavioral Explanation Using LLMs

Authors:

Narumi Yoneda, Ryo Hatano and Hiroyuki Nishiyama

Abstract: In this study, we collected data on malware behavior and generated explanatory descriptions using a large language model (LLM). The objective of this study is to determine whether a given malware sample truly exhibits malicious behavior. To collect detailed information, we modified the Linux kernel to build a system capable of capturing information about the arguments and return values of invoked system calls. We subsequently analyzed the data obtained from our system for indications that the malware exhibited malicious or anti-analysis behavior. Additionally, we assessed whether the LLM could interpret this data and provide an explanation of the malware behavior. This approach constitutes a shift in focus from the method of attack, which is examined in the detection of the malware family, to an evaluation of the malicious nature of the actions performed by the malware. Our inferences demonstrated that our data could represent both what the malware “attempted to do” and what it “actually did,” and the LLM was able to accurately interpret this data and explain the malware behavior.
Download

Paper Nr: 149
Title:

A Systematic Review of Sustainable Supplier Selection Using Advanced Artificial Intelligence Methods

Authors:

Hanen Neji, Mouna Rekik, Lotfi Souifi and Ismail Bouassida Rodriguez

Abstract: Artificial intelligence (AI) algorithms have significantly advanced various fields, driving innovation in domains such as healthcare, finance, and sustainability. In the realm of sustainable development, selecting suppliers is crucial for promoting environmental responsibility and safeguarding the well-being of future generations. This complex decision-making process requires evaluating suppliers across numerous criteria. Multi-Criteria Decision-Making (MCDM) and AI techniques, including Natural Language Processing (NLP), Deep Learning (DL), and Machine Learning (ML), have emerged as powerful tools to address these challenges. However, these methods often face transparency issues and the risk of greenwashing, which can erode trust in sustainability assessments. To address this, we conducted a systematic literature review (SLR) of 44 papers published between 2019 and 2024, sourced from databases such as Springer (12 papers), IEEE Xplore Digital Library (11 papers), and Science Direct (21 papers). This review offers an equitable analysis of MCDM and AI models (NLP, DL, ML) for evaluating both supplier sustainability and the risk of greenwashing. Additionally, sentiment analysis techniques are integrated to enhance transparency and provide insights into stakeholder perceptions.
Download

Paper Nr: 155
Title:

Revisit the Algorithm Selection Problem for TSP with Spatial Information Enhanced Graph Neural Networks

Authors:

Ya Song, Laurens Bliek and Yingqian Zhang

Abstract: Algorithm selection is a well-known problem where researchers investigate how to construct useful features representing the problem instances and then apply feature-based machine learning models to predict the best algorithm for each instance. However, even for simple optimization problems like the Euclidean Traveling Salesman Problem (TSP), a general and effective feature representation for problem instances is lacking. The important features of TSP are relatively well understood in the literature, based on extensive domain knowledge and post-analysis of the solutions. In recent years, Convolutional Neural Networks (CNNs) have gained popularity for TSP algorithm selection. Compared to traditional feature-based models, a CNN has an automatic feature-learning ability and demands less domain expertise. However, it still requires generating intermediate representations first, i.e., multiple images that represent TSP instances. In this paper, we revisit algorithm selection for TSP and propose GINES, a new Graph Neural Network (GNN) that uses city coordinates and distances as input. GINES introduces a novel message-passing mechanism and local feature extractor to learn TSP’s spatial information. Evaluation on two benchmarks shows GINES outperforms CNN and GINE models and surpasses traditional feature-based methods on one dataset. Our code and datasets are available at https://github.com/lurenyi233/GINES TSP.
Download

Paper Nr: 158
Title:

Implementation of Quantum Machine Learning on Educational Data

Authors:

Sofía Ramos-Pulido, Neil Hernández-Gress, Glen S. Uehara, Andreas Spanias and Héctor G. Ceballos-Cancino

Abstract: This study is the first to implement quantum machine learning (QML) on educational data to predict alumni results. This study aims to show that we can design and implement QML algorithms for this application case and compare their accuracy with those of classical ML algorithms. We consider three target variables in a high-dimensional dataset with approximately 100 features and 25,000 instances or samples: whether an alumnus will secure a CEO position, alumni salary, and alumni satisfaction. These variables were selected because they provide insights into the effect of education on alumni careers. Due to the computational limitations of running QML on high-dimensional data, we propose to use principal component analysis for dimensionality reduction, a barycentric correction procedure for instance reduction, and two quantum-kernel ML algorithms for classification, namely quantum support vector classifier (QSVC) and Pegasos QSVC. We observe that currently one can implement quantum-kernel ML algorithms and achieve results comparable to those of classical ML algorithms. For example, the accuracy of the classical and quantum algorithms is 85% in predicting whether an alumnus will secure a CEO position. Although QML currently offers no time or accuracy advantages, these findings are promising as quantum hardware evolves.
Download

Paper Nr: 163
Title:

ForecastBoost: An Ensemble Learning Model for Road Traffic Forecasting

Authors:

Syed Muhammad Abrar Akber, Sadia Nishat Kazmi, Ali Muqtadir and Syed Muhammad Zubair Akber

Abstract: Accelerated urbanization is causing ever-increasing road traffic around the world. This rapid increase in road traffic poses several challenges, such as road congestion, suboptimal emergency services due to inadequate road infrastructure, and a lack of economic sustainability. To overcome such challenges, intelligent transportation systems have recently become increasingly popular. Traffic prediction is an important part of such intelligent traffic management systems. Accurate traffic prediction leads to improved traffic flow, avoids congestion, and optimizes the timing of traffic signals, resulting in higher vehicle fuel efficiency. Lower fuel consumption due to better fuel efficiency also limits the carbon footprint, which helps in combating global warming. To accurately predict road traffic, this paper proposes the ForecastBoost model, which leverages an ensemble learning approach. ForecastBoost integrates two regression learning algorithms, namely Extreme Gradient Boosting and Categorical Boosting. The first component handles missing values and sparse data, and the second handles categorical features without overfitting. We train the proposed ForecastBoost on a publicly available real-world traffic dataset. The results are compared against state-of-the-art algorithms such as Neural Hierarchical Interpolation for Time Series Forecasting (N-HiTS), Series-cOre Fused Time Series (SOFTS) and TimesNET. We use several well-known performance metrics, including mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE), to evaluate the performance of the proposed ForecastBoost. The evaluation results show that the proposed ForecastBoost outperforms the other models.
Download
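
The three error metrics named in Paper Nr 163 have standard definitions, sketched below in plain Python. The traffic counts in the example are hypothetical and are not results from the paper.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent; undefined if any true value is 0."""
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error; penalises large errors more heavily than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical hourly vehicle counts and a model's predictions.
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]
scores = {
    "MAE": mae(observed, predicted),
    "MAPE": mape(observed, predicted),
    "RMSE": rmse(observed, predicted),
}
```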

Paper Nr: 169
Title:

Scaling Multi-Frame Transformers for End-to-End Driving

Authors:

Vasileios Kochliaridis, Filippos Moumtzidellis and Ioannis Vlahavas

Abstract: Vision-based end-to-end controllers hold the potential to revolutionize the production of Autonomous Vehicles by simplifying the implementation of navigation systems and reducing their development costs. However, the large-scale implementation of such controllers faces challenges, such as accurately estimating object trajectories and making robust real-time decisions. Advanced Deep Learning architectures combined with Imitation Learning provide a promising solution, allowing these controllers to learn from expert demonstrations to map observations directly to vehicle controls. Despite the progress, existing controllers still struggle with generalization and are difficult to train efficiently. In this paper, we introduce CILv3D, a novel video-based end-to-end controller that processes multi-view video frames and learns complex spatial-temporal features using attention mechanisms and 3D convolutions. We evaluate our approach by comparing its performance to the previous state-of-the-art and demonstrate significant improvements in the vehicle control accuracy. Our findings suggest that our approach could enhance the scalability and robustness of autonomous driving systems.
Download

Paper Nr: 170
Title:

Evaluating Biased Synthetic Data Effects on Large Language Model-Based Software Vulnerability Detection

Authors:

Lucas B. Germano, Lincoln Q. Vieira, Ronaldo R. Goldschmidt, Julio Cesar Duarte and Ricardo Choren

Abstract: Software security ensures data privacy and system reliability. Vulnerabilities in the development cycle can lead to privilege escalation, causing data exfiltration or denial of service attacks. Static code analyzers, based on predefined rules, often fail to detect errors beyond these patterns and suffer from high false positive rates, making rule creation labor-intensive. Machine learning offers a flexible alternative, which can use extensive datasets of real and synthetic vulnerability data. This study examines the impact of bias in synthetic datasets on model training. Using CodeBERT for C/C++ vulnerability classification, we compare models trained on biased and unbiased data, incorporating overlooked preprocessing steps to remove biases. Results show that the unbiased model achieves 98.5% accuracy, compared to 63.0% for the biased model, emphasizing the critical need to address dataset biases in training.
Download

Paper Nr: 174
Title:

Multi-Agent Path Finding Using Provisionally Booking Nodes for Pickup and Delivery Problems

Authors:

Daiki Shimada, Yuki Miyashita and Toshiharu Sugawara

Abstract: We propose an efficient method for determining subsequent movements based on temporarily generated shortest paths in the multi-agent pickup and delivery (MAPD) problem. The MAPD problem involves multiple agents (such as carrier robots) continuously performing transportation tasks in a vast environment with obstacles while avoiding collisions with other agents. Our method is an extension of the decentralized path-finding algorithm priority inheritance with backtracking (PIBT), and can be efficient in environments with narrow one-way paths and few detours. Our method, PIBT with provisional booking (PIBT-PB), not only secures the next node as in PIBT but also provisionally books some nodes in advance based on dynamic priorities between agents to detect possible conflicts earlier. It therefore reduces the number of “turning back” and wasted “waiting” actions in such environments. Our experiments show that PIBT-PB is more efficient than the baselines, PIBT and windowed PIBT, and even in less restrictive environments, it performs as efficiently as PIBT.
Download

Paper Nr: 175
Title:

Decoding Persuasiveness in Eloquence Competitions: An Investigation into the LLM’s Ability to Assess Public Speaking

Authors:

Alisa Barkar, Mathieu Chollet, Matthieu Labeau, Beatrice Biancardi and Chloe Clavel

Abstract: The increasing importance of public speaking (PS) skills has fueled the development of automated assessment systems, yet the integration of large language models (LLMs) in this domain remains underexplored. This study investigates the application of LLMs for assessing PS by predicting persuasiveness. We propose a novel framework where LLMs evaluate criteria derived from educational literature and feedback from PS coaches, offering new interpretable textual features. We demonstrate that persuasiveness predictions of a regression model with the new features achieve a Root Mean Squared Error (RMSE) of 0.6, underperforming an approach with hand-crafted lexical features (RMSE of 0.51) and outperforming direct zero-shot LLM persuasiveness predictions (RMSE of 0.8). Furthermore, we find that only the LLM-evaluated criterion of language level is predictable from lexical features (F1-score of 0.56), disproving relations between these features. Based on our findings, we criticise the abilities of LLMs to analyze PS accurately. To ensure reproducibility and adaptability to emerging models, all source code and materials are publicly available on GitHub.

Paper Nr: 178
Title:

An Efficient Genetic Algorithm for Service Placement in Fog Computing

Authors:

Dihia Bendjenahi, Chadia Moumeni and Malika Bessedik

Abstract: The FSPP (Fog Service Placement Problem) involves the allocation of fog and cloud resources to IoT applications while meeting specific application requirements, including deadlines and minimizing response time. This paper addresses the FSPP in heterogeneous fog-cloud computing environments using a genetic algorithm (GA) approach. Our proposed GA aims to jointly maximize total resource utilization while respecting application deadlines. The paper presents a detailed system model, problem formulation, and the proposed GA methodology. Experimental results demonstrate the effectiveness of the GA approach in optimizing resource allocation and meeting Quality of Service (QoS) requirements when compared to the first-fit heuristic and to the random approach.

Paper Nr: 184
Title:

Sentiment-Enriched AI for Toxic Speech Detection: A Case Study of Political Discourses in the Valencian Parliament

Authors:

Antoni Mestre, Franccesco Malafarina, Joan Fons, Manoli Albert, Miriam Gil and Vicente Pelechano

Abstract: The increasing prevalence of toxic speech across various societal domains has raised significant concerns regarding its impact on communication and social interactions. In this context, the analysis of toxicity through AI techniques has gained prominence as a relevant tool for detecting and combating this phenomenon. This study proposes a novel approach to toxic speech detection by integrating sentiment analysis into binary classification models. By establishing a confusion zone for ambiguous probability scores, we direct uncertain cases to a sentiment analysis module that informs final classification decisions. Applied to political discourses in the Valencian Parliament, this sentiment-enriched approach significantly improves classification accuracy and reduces misclassifications compared to traditional methods. These findings underscore the effectiveness of incorporating sentiment analysis to enhance the robustness of toxic speech detection in complex political contexts, paving the way for future research in this relevant area.
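
The routing idea described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the zone bounds (0.4 and 0.6), the sentiment scale (negative values meaning negative sentiment), and the rule that negative sentiment resolves an ambiguous case as toxic are all hypothetical and are not given in the abstract.

```python
def classify_with_confusion_zone(
    p_toxic: float,
    sentiment: float,
    low: float = 0.4,    # assumed lower bound of the confusion zone
    high: float = 0.6,   # assumed upper bound of the confusion zone
) -> bool:
    """Return True if the text is labelled toxic.

    p_toxic   : probability from the binary toxicity classifier
    sentiment : score in [-1, 1] from a separate sentiment module
    """
    if p_toxic >= high:
        return True          # confidently toxic
    if p_toxic <= low:
        return False         # confidently non-toxic
    # Ambiguous score: defer to sentiment (assumed rule: negative => toxic).
    return sentiment < 0.0
```

Only scores that fall strictly inside the zone consult the sentiment module, so confident classifier decisions are left unchanged.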

Paper Nr: 185
Title:

A Multitier Approach for Dynamic and Partially Observable Multiagent Path-Finding

Authors:

Anıl Doğru, Amin Deldari Alamdari, Duru Balpınarlı and Reyhan Aydoğan

Abstract: This paper introduces a novel Dynamic and Partially Observable Multiagent Path-Finding (DPO-MAPF) problem and presents a multitier solution approach accordingly. Unlike traditional MAPF problems with static obstacles, DPO-MAPF involves dynamically moving obstacles that are partially observable and exhibit unpredictable behavior. Our multitier solution approach combines centralized planning with decentralized execution. In the first tier, we apply state-of-the-art centralized and offline path planning techniques to navigate around static, known obstacles (e.g., walls, buildings, mountains). In the second tier, we propose a decentralized and online conflict resolution mechanism to handle the uncertainties introduced by partially observable and dynamically moving obstacles (e.g., humans, vehicles, animals). This resolution employs a metaheuristic-based revision process guided by a consensus protocol to ensure fair and efficient path allocation among agents. Extensive simulations validate the proposed framework, demonstrating its effectiveness in finding valid solutions while ensuring fairness and adaptability in dynamic and uncertain environments.

Paper Nr: 186
Title:

Improving Controlled Text Generation via Neuron-Level Control Codes

Authors:

Jay Orten and Nancy Fulda

Abstract: Task-specific text generation is a highly desired feature for language models, as it allows the production of text completions that are either broadly or subtly aligned with specific objectives. By design, many neural networks switch between multiple behaviors during inference - for example, when selecting a target language in many-to-many translation systems. Such task-specific information is usually presented to the network as an augmentation of its input data. In this work, we explore an alternate approach: transmitting task information directly to each neuron in the network. This removes the need for task information to propagate forward during training, a particularly critical advantage in low-resource settings where maximum benefit must be extracted from each training example. To test this approach, we train over 160 language models from scratch with a large variety of architectures and configurations. Our results show that models with neuron-level augmentation can experience increased learning speed, improved final generation accuracy, and even novel learning capabilities, with greater benefits as network depth increases.

Paper Nr: 188
Title:

Data-Driven Fairness Generalization for Deepfake Detection

Authors:

Uzoamaka Ezeakunne, Chrisantus Eze and Xiuwen Liu

Abstract: Despite the progress made in deepfake detection research, recent studies have shown that biases in the training data for these detectors can result in varying levels of performance across different demographic groups, such as race and gender. These disparities can lead to certain groups being unfairly targeted or excluded. Traditional methods often rely on fair loss functions to address these issues, but they underperform when applied to unseen datasets; hence, fairness generalization remains a challenge. In this work, we propose a data-driven framework for tackling the fairness generalization problem in deepfake detection by leveraging synthetic datasets and model optimization. Our approach focuses on generating and utilizing synthetic data to enhance fairness across diverse demographic groups. By creating a diverse set of synthetic samples that represent various demographic groups, we ensure that our model is trained on a balanced and representative dataset. This approach allows us to generalize fairness more effectively across different domains. We employ a comprehensive strategy that leverages synthetic data, a loss sharpness-aware optimization pipeline, and a multi-task learning framework to create a more equitable training environment, which helps maintain fairness across both intra-dataset and cross-dataset evaluations. Extensive experiments on benchmark deepfake detection datasets demonstrate the efficacy of our approach, surpassing state-of-the-art approaches in preserving fairness during cross-dataset evaluation. Our results highlight the potential of synthetic datasets in achieving fairness generalization, providing a robust solution for the challenges faced in deepfake detection.

Paper Nr: 200
Title:

Efficient Automatic Data Augmentation of CDT Images to Support Cognitive Screening

Authors:

Nina Hosseini-Kivanani, Inês Oliveira, Sena Kilinç and Luis A. Leiva

Abstract: We investigate the effectiveness of learnable and non-learnable automatic data augmentation (AutoDA) techniques in enhancing Deep Learning (DL) models for classifying Clock Drawing Test (CDT) images used in cognitive dysfunction screening. The classification is between healthy controls (HCs) and individuals with mild cognitive impairment (MCI). Specifically, we evaluate TrivialAugment (TA) and UniformAugment (UA), adapted for clinical image classification to address data scarcity and class imbalance. Our experiments across three public datasets demonstrate significant improvements in model performance and generalization. Notably, TA increased classification accuracy by up to 15 points, while UA achieved a 12-point improvement. These techniques offer a computationally efficient alternative to learnable methods like RandAugment (RA), which we also compare against, delivering comparable (and sometimes better) results with a much lower computational overhead. Our findings indicate that AutoDA techniques, particularly TA and UA, can be effectively applied in clinical settings, providing robust tools for the early detection of cognitive disorders, including Alzheimer’s disease and dementia.

Paper Nr: 203
Title:

Public Transport Network Design for Equality of Accessibility via Message Passing Neural Networks and Reinforcement Learning

Authors:

Duo Wang, Andrea Araldo and Maximilien Chau

Abstract: Graph learning involves embedding relevant information about a graph’s structure into a vector space. However, graphs often represent objects within a physical or social context, such as a Public Transport (PT) graph, where nodes represent locations surrounded by opportunities. In these cases, the performance of the graph depends not only on its structure but also on the physical and social characteristics of the environment. Optimizing a graph may require adapting its structure to these contexts. This paper demonstrates that Message Passing Neural Networks (MPNNs) can effectively embed both graph structure and environmental information, enabling the design of PT graphs that meet complex objectives. Specifically, we focus on accessibility, an indicator of how many opportunities can be reached in a unit of time. We set the objective to design an “equitable” PT graph with a lower accessibility inequality. We combine MPNN with Reinforcement Learning (RL) and show the efficacy of our method against metaheuristics in a use case representing in simplified terms the city of Montreal. Our superior results show the capacity of MPNN and RL to capture the intricate relations between the PT graph and the environment, which metaheuristics do not achieve.

Paper Nr: 206
Title:

Building a Risk Profile for Detecting Terrorism Financing

Authors:

David Makiya and João Balsa

Abstract: This paper presents a novel and theoretical approach to detecting terrorism financing through the development of risk-based transaction profiles using machine learning models. By integrating client and transaction data, the proposed framework employs unsupervised clustering techniques to identify suspicious financial activities. A multi-agent system, coupled with National Risk Indicators (NRI) and Long Short-Term Memory (LSTM) neural networks, can enhance predictive capabilities for easier detection. The proposed model addresses the evolving strategies of terrorist groups, offering financial institutions a dynamic and scalable tool for mitigating terrorism financing risks while improving accuracy in anti-money laundering (AML) and counter-terrorism financing (CTF) efforts.

Paper Nr: 207
Title:

Reptile Search Algorithm Based Feature Selection Approach for Intrusion Detection

Authors:

Maher O. Al-Khateeb and Ali Douik

Abstract: In cybersecurity, the rise of machine learning (ML) based security solutions has led to a new era of defense against evolving threats, with intrusion detection (ID) systems at the forefront. However, the effectiveness of these systems is profoundly influenced by the quality and relevance of the input features. The presence of redundant features can compromise their performance, making feature selection (FS) a crucial step in optimizing ID solutions. This paper uses the Reptile Search Algorithm (RSA) as a powerful FS method. It offers a gradient-free approach, avoiding local optima and enabling global optimization. A comparative analysis on five freely available ID datasets, benchmarked against several methods, validated the superior performance of the RSA for ID.

Paper Nr: 210
Title:

Generative AI for Human 3D Body Emotions: A Dataset and Baseline Methods

Authors:

Ciprian Paduraru, Petru-Liviu Bouruc and Alin Stefanescu

Abstract: Accurate and expressive representation of human emotions in 3D models remains a major challenge in various industries, including gaming, film, healthcare, virtual reality and robotics. This work aims to address this challenge by utilizing a new dataset and a set of baseline methods within an open-source framework developed to improve realism and emotional expressiveness in human 3D representations. At the center of this work is the use of a novel and diverse dataset consisting of short video clips showing people mimicking specific emotions: anger, happiness, surprise, disgust, sadness, and fear. The dataset was further processed using state-of-the-art parametric body models that accurately reproduce these emotions. The resulting 3D meshes were then integrated into a generative pose generation model capable of producing similar emotions.

Paper Nr: 226
Title:

Classification of Complaints Text Data by Ensembling Large Language Models

Authors:

Pruthweesha Airani, Neha Pipada and Pratik Shah

Abstract: Effective and efficient management of consumer complaints requires segregating complaints by category, such as product or service. In this work, we propose an ensemble classification approach based on statistical class incidence frequencies from the softmax confidence scores of an ensemble of classifiers. The classifiers process the complaint text through Large Language Models (LLMs) followed by discriminating networks. LLMs along with discriminators are fine-tuned on a large, publicly available dataset of over 162,000 annotated consumer complaint records pertaining to banking services. The proposed ensemble approach utilizes confidence scores from individual classifiers (LLM embeddings + discriminator network), achieving better accuracy. It is based on statistical analysis of class-wise precision as a function of confidence score. The individual classifiers built on various SMLMs & LLMs are experimented with, and the results are tabulated for the complaints classification task.

Paper Nr: 227
Title:

Multimodal Stock Price Prediction

Authors:

Furkan Karadaş, Bahaeddin Eravcı and Ahmet Murat Özbayoğlu

Abstract: In an era where financial markets are heavily influenced by many static and dynamic factors, it has become increasingly critical to carefully integrate diverse data sources with machine learning for accurate stock price prediction. This paper explores a multimodal machine learning approach for stock price prediction by combining data from diverse sources, including traditional financial metrics, tweets, and news articles. We capture real-time market dynamics and investor mood through sentiment analysis on these textual data using both ChatGPT-4o and FinBERT models. We look at how these integrated data streams augment predictions made with a standard Long Short-Term Memory (LSTM) model to illustrate the extent of performance gains. Our study's results indicate that incorporating the mentioned data sources considerably increases the forecast effectiveness of the reference model by up to 5%. We also provide insights into the individual and combined predictive capacities of these modalities, highlighting the substantial impact of incorporating sentiment analysis from tweets and news articles. This research offers a systematic and effective framework for applying multimodal data analytics techniques in financial time series forecasting that provides a new perspective for investors to leverage data for decision-making.

Paper Nr: 230
Title:

Towards Personal Assistants for Energy Processes Based on Locally Deployed LLMs

Authors:

Maximilian Orlowski, Emilia Knauff and Florian Marquardt

Abstract: This paper presents a coaching assistant for network operator processes based on a Retrieval-Augmented Generation (RAG) system leveraging open-source Large Language Models (LLMs) as well as Embedding Models. The system addresses challenges in employee onboarding and training, particularly in the context of increased customer contact due to more complex and extensive processes. Our approach incorporates domain-specific knowledge bases to generate precise, context-aware recommendations while mitigating LLM hallucination. We introduce our system architecture, which runs all components on-premise in our own datacenter, ensuring data security and process knowledge control. We also describe requirements for underlying knowledge documents and their impact on assistant answer quality. Our system aims to improve onboarding accuracy and speed while reducing senior employee workload. The results of our study show that realizing a coaching assistant for German network operators is reasonable when addressing performance, correctness, integration and locality. However, current results regarding accuracy do not yet meet the requirements for productive use.

Paper Nr: 234
Title:

GenGUI: A Dataset for Automatic Generation of Web User Interfaces Using ChatGPT

Authors:

Mădălina Dicu, Enol García González, Camelia Chira and José R. Villar

Abstract: The identification of elements in user interfaces is a problem that can generate great interest in current times due to the significant interaction between users and machines. Digital technologies are increasingly used to carry out almost any daily task. Computer vision can be helpful in different applications, such as accessibility, testing, or automatic code generation, to accurately identify the elements that make up a graphical interface. This paper focuses on one problem that affects almost any Deep Learning and computer vision task, which is the generation and annotation of datasets. Few contributions in the literature provide datasets to train vision models to solve this problem. Moreover, according to the literature, most datasets focus on generating images of mobile applications, all in English. In this paper, we propose GenGUI, a new dataset of desktop applications that presents various contents, including multiple languages. Furthermore, we train different versions of YOLO models on GenGUI to test its quality, with reasonably good results.

Paper Nr: 238
Title:

A Hybrid CNN-LSTM Model for Opinion Mining and Classification of Course Reviews

Authors:

Hatem Majouri, Olfa Gaddour and Yessine Hadj Kacem

Abstract: Automatic analysis of online course reviews is a critical task that has garnered significant interest, particularly for improving the quality of e-learning platforms. The challenge lies in accurately classifying user feedback in order to generate actionable insights for educators and learners. In this work, we investigate the effectiveness of a hybrid CNN-LSTM model compared to several state-of-the-art deep learning models, including BERT, LSTM, GRU, and CNN, for analyzing reviews collected from the FutureLearn platform. Our experiments demonstrate that the proposed model achieves superior performance in classifying user reviews, with an accuracy of 0.95. These results highlight the potential of advanced deep learning techniques in extracting meaningful insights from user feedback, offering valuable guidance for course developers and learners.

Paper Nr: 244
Title:

Enhanced Guided Local Search for Addressing the Graph Burning Problem

Authors:

Lamia Sadeg-Belkacem, Imad Tamelghaghet and Fatima Benbouzid-Si Tayeb

Abstract: Information spread is crucial in network science, investigating how influence, data, or contagion propagates through networks. Graph burning offers a simplified deterministic model for addressing the NP-complete Graph Burning Problem. Acknowledging the unique characteristics of this problem, this paper introduces an efficient guided local search approach, leveraging betweenness centrality to initialize the solution process and integrating an augmented function with penalty terms to optimize the burning sequence. Using a binary search mechanism, candidate values are iteratively tested. Experimental results on 15 benchmark graphs demonstrate the algorithm’s superior performance compared to state-of-the-art methods.
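
The binary search over candidate burning lengths mentioned in this abstract can be sketched as follows. This is not the paper's guided local search: the oracle below is an exhaustive search that is only usable on very small graphs, and it stands in for whatever heuristic proposes a sequence of a given length. All function names are hypothetical. The check relies on the standard characterisation that a sequence (x_1, ..., x_k) burns a graph when every vertex lies within distance k - i of some x_i.

```python
from collections import deque
from itertools import permutations


def burns(graph, seq):
    """Check whether the burning sequence `seq` burns every vertex of `graph`.

    graph: dict mapping each vertex to an iterable of its neighbours.
    The i-th source (1-indexed) spreads for k - i rounds, where k = len(seq).
    """
    k = len(seq)
    burned = set()
    for i, source in enumerate(seq, start=1):
        radius = k - i
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # BFS truncated at `radius`
            u = queue.popleft()
            if dist[u] == radius:
                continue
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        burned.update(dist)
    return len(burned) == len(graph)


def brute_force_oracle(graph, k):
    """Stand-in oracle: try every ordered choice of k distinct sources."""
    for seq in permutations(graph, k):
        if burns(graph, seq):
            return list(seq)
    return None


def shortest_burning_sequence(graph, oracle=brute_force_oracle):
    """Binary search for the smallest length the oracle can realise."""
    lo, hi = 1, len(graph)
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(graph, mid) is not None:
            hi = mid                      # length mid works, try shorter
        else:
            lo = mid + 1                  # length mid fails, need longer
    return oracle(graph, lo)


# Path graph on four vertices: 0 - 1 - 2 - 3.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
sequence = shortest_burning_sequence(path4)
```

The binary search assumes that if the oracle succeeds for some length it also succeeds for every larger length. That holds for the exhaustive oracle but is only an assumption for a heuristic one.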

Paper Nr: 254
Title:

MOTIF: A Framework for Enhancing the Profiling Module of Generative Agents that Simulate Human Behavior

Authors:

Tibério Cerqueira and Pamela Bezerra

Abstract: Recent advances in Large Language Models (LLMs) have made the development of architectures that convincingly simulate human behavior possible. These architectures give rise to generative agents (GA), a new class of intelligent agents capable of carrying out human activities such as forming opinions, initiating dialogues, and planning the day. These experiences are stored as natural language and later transformed into reflections, which are then used to guide future actions. Some of the advantages of GA are the ability to operate in dynamic and open environments, interact with other agents in a more human-related way, and adapt to changes. These agents, however, require a complex development process. Given this, this study proposes MOTIF, a framework for facilitating and speeding up the initial stage of building these agents, known as profiling. This stage is responsible for defining the agents’ identities and personalities. However, profiling is very subjective and lacks a standard process, with some solutions manually writing each profile, while others use LLMs. MOTIF combines both manual and LLM-based methods to enable the development of agents with well-defined personalities and identities. Additionally, it provides a way of standardizing and formalizing the profiling stage, creating the basis for future research in this field.

Paper Nr: 255
Title:

Federated Machine Learning Framework for Soil Classification in Smart Agriculture

Authors:

Marwen Ghabi, Sofiane Khalfallah and Hela Ltifi

Abstract: In the area of smart agriculture, data management and analysis play a key role in improving agricultural practices. However, the centralization of data poses major challenges in terms of confidentiality, especially due to the sensitivity of information collected from farms. Federated learning addresses these concerns by enabling the training of AI models in a decentralized manner, where data remains localized while sharing only model updates. This approach ensures confidentiality while facilitating collaboration between different data sources. This study presents an innovative solution that combines federated learning with a modular microservices-based architecture to deploy predictive models as a machine learning service. This architecture, consisting of microservices dedicated to data management, local model training, federated aggregation, and Application Programming Interface (API) delivery, enables real-time predictions to be delivered in a scalable and resilient manner. To illustrate this approach, a case study on soil type classification was conducted. The results show that our method not only preserves the confidentiality of distributed agricultural data, but also improves the accuracy of agricultural recommendations. The integration of federated learning into a microservices architecture represents a significant step forward, offering new perspectives for artificial intelligence in complex environments requiring confidentiality and scalability.
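
The federated aggregation step can be illustrated with a minimal sketch. The abstract does not name an aggregation rule, so the sample-weighted averaging used here (the widely used FedAvg rule) is an assumption, and the function name and data layout are hypothetical.

```python
from typing import List, Sequence, Tuple


def federated_average(
    client_updates: Sequence[Tuple[Sequence[float], int]],
) -> List[float]:
    """Average client weight vectors, weighted by local sample counts.

    client_updates: (weights, n_samples) pairs, one per client. Only these
    model parameters are shared; raw farm data never leaves the client.
    """
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("at least one client must contribute samples")
    dim = len(client_updates[0][0])
    return [
        sum(weights[i] * n for weights, n in client_updates) / total
        for i in range(dim)
    ]


# Two hypothetical clients with 1 and 3 local samples respectively.
global_weights = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
```

Weighting by sample count lets clients with more local data contribute proportionally more to the global model.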

Paper Nr: 258
Title:

On the Quest for an NLP-Driven Framework for Value-Based Decision-Making in Automatic Agent Architecture

Authors:

Alicia Pina-Zapata and Sara García-Rodríguez

Abstract: As automatic agents begin to operate in high-stakes areas like finance and healthcare, the alignment of AI goals with human values becomes increasingly critical, addressing the so-called “alignment problem”. To tackle this challenge, the paper proposes the architecture of a Value-Based autonomous Agent capable of interpreting its environment through the lens of human values and guiding its decision-making processes in accordance with its own values. The agent utilizes a natural language processing (NLP) technique to detect and assess the values associated with various actions, selecting those most aligned with its moral guidelines. The integration of NLP into the agent’s architecture is crucial for enhancing its ability to make autonomous value-aligned decisions, offering a framework for incorporating ethical considerations into AI development.

Paper Nr: 259
Title:

Dimensionality Reduction on the SPD Manifold: A Comparative Study of Linear and Non-Linear Methods

Authors:

Amal Araoud, Enjie Ghorbel and Faouzi Ghorbel

Abstract: The representation of visual data using Symmetric Positive Definite (SPD) matrices has proven effective in numerous computer vision applications. Nevertheless, the non-Euclidean nature of the SPD space poses a challenge, especially when dealing with high-dimensional data. Conventional dimensionality reduction methods have typically been designed for data lying in linear spaces, rendering them theoretically unsuitable for SPD matrices. For that reason, considerable efforts have been made to adapt these methods to the SPD space by leveraging its Riemannian structure. Despite these advances, a systematic comparison of conventional (i.e., linear) and revisited (i.e., non-linear) dimensionality reduction methods applied to SPD data according to their distribution remains lacking. In fact, while geometry-aware dimensionality reduction methods are highly relevant, the convexity of the SPD space may hinder their performance. This study addresses this gap by evaluating the performance of both linear and non-linear dimensionality reduction techniques within a binary classification scenario. For that purpose, a synthetically generated dataset exhibiting different class distribution configurations (distant, slight overlap, strong overlap) is used. The obtained results suggest that non-linear methods offer limited advantages over linear approaches. According to our analysis, this outcome may be attributed to two primary factors: the convexity of the SPD space and numerical issues.

Paper Nr: 260
Title:

Ensemble of Neural Networks to Forecast Stock Price by Analysis of Three Short Timeframes

Authors:

Ubongabasi Ebong Etim, Vitaliy Milke and Cristina Luca

Abstract: Financial markets are known for complexity and volatility, and predicting the direction of price movement of financial instruments is essential for financial market participants. This paper aims to use neural networks to predict the direction of Apple’s share price movement. Historical stock price data on three intraday timeframes and technical indicators selected for each timeframe are used to develop and evaluate the performance of various neural network models, including Multilayer Perceptron and Convolutional Neural Networks. This research also highlights the importance of selecting appropriate technical indicators for different timeframes to optimise the performance of the selected neural network models. It showcases the use of neural networks within an ensemble architecture that tracks the directional movement of Apple Inc. share prices by combining upward and downward predictions from the three short timeframes. This approach generates a trading system with buy and sell signals for intraday trading.
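
One simple way to combine directional predictions from three timeframes into a trading signal is sketched below. The abstract does not specify the combination rule, so the unanimity rule used here is an assumption, as are the timeframe labels and the "hold" outcome.

```python
from typing import Mapping


def combine_timeframes(predictions: Mapping[str, str]) -> str:
    """Map per-timeframe 'up'/'down' predictions to a trading signal.

    Assumed rule: act only when all timeframes agree, otherwise hold.
    """
    directions = set(predictions.values())
    if not directions <= {"up", "down"}:
        raise ValueError("each prediction must be 'up' or 'down'")
    if directions == {"up"}:
        return "buy"
    if directions == {"down"}:
        return "sell"
    return "hold"


# Hypothetical timeframe labels, for illustration only.
signal = combine_timeframes({"1min": "up", "5min": "up", "15min": "up"})
```

Requiring agreement trades fewer signals for higher confidence. A majority vote would be an equally plausible alternative.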

Paper Nr: 263
Title:

Investigating the Configurability of LLMs for the Generation of Knowledge Work Datasets

Authors:

Desiree Heim, Christian Jilek, Adrian Ulges and Andreas Dengel

Abstract: The evaluation of support tools designed for knowledge workers is challenging due to the lack of publicly available, extensive, and complete data collections. Existing data collections have inherent problems such as incompleteness due to privacy-preserving methods and lack of contextual information. Hence, generating datasets can represent a good alternative; in particular, Large Language Models (LLMs) offer a simple way of generating textual artifacts. Just recently, we therefore proposed a knowledge work dataset generator, called KnoWoGen. So far, the adherence of generated knowledge work documents to parameters such as document type, involved persons, or topics has not been examined. However, this aspect is crucial to examine since generated documents should reflect given parameters properly as they could serve as highly relevant ground truth information for training or evaluation purposes. In this paper, we address this missing evaluation aspect by conducting respective user studies. These studies assess the documents’ adherence to multiple parameters and specifically to a given domain parameter as an important representative. We base our experiments on documents generated with KnoWoGen and use the Mistral-7B-Instruct model as the LLM. We observe that in the given setting, the generated documents showed a high quality regarding the adherence to parameters in general and specifically to the parameter specifying the document’s domain. Hence, 75% of the given ratings in the parameter-related experiments received the highest or second-highest quality score, which is a promising outcome for the feasibility of generating high-quality knowledge work documents based on given configurations.

Paper Nr: 267
Title:

Adjusting Doctor's Reliance on AI Through Labeling for Training Data and Modification of AI Output in a Muscle Tissue Detection Task

Authors:

Keito Miyake, Kumi Ozaki, Akihiro Maehigashi and Seiji Yamada

Abstract: Due to the significant advancements in artificial intelligence (AI), AI technologies are increasingly providing support in various fields. However, even if AI performs at a high level, humans refuse AI for no obvious reason and prefer to solve problems on their own. For instance, experts such as medical professionals tend to be more reluctant to rely on a medical AI’s diagnosis than on a human medical professional. This tendency leads to undertrust in AI and could affect its implementation in society. Thus, this study aims to mitigate the undertrust in AI by providing two functions from the perspective of interaction design: (a) labeling AI outputs as correct or incorrect for training data and (b) modifying AI outputs. To evaluate the effectiveness of these two functions in increasing medical professionals’ reliance on AI, we conducted an experiment involving 25 radiologists and radiographers participating in a muscle-tissue-detection task. A two-way analysis of variance was conducted to analyze their AI-usage rate. The results indicate that both functions statistically increased reliance on AI. Our novel finding is that when radiologists are enabled to control AI output by labeling results as correct or incorrect, their reliance on AI increases.

Paper Nr: 271
Title:

Federated Learning Harnessed with Differential Privacy for Heart Disease Prediction: Enhancing Privacy and Accuracy

Authors:

Wided Moulahi, Tarek Moulahi, Imen Jdey and Salah Zidi

Abstract: The increasing digitization of healthcare raises concerns surrounding patients’ privacy. Therefore, the integration of privacy-preserving technologies has proven imperative to curb the negative repercussions tied to technology deployment in the medical sector and to provide trustworthy artificial intelligence healthcare applications. Two rising approaches are at the forefront of research and gaining momentum in the realm of healthcare smart systems: Federated Learning and Differential Privacy. On one hand, Federated Learning (FL) enables collaborative model training across multiple institutions without exchanging raw data. Differential Privacy (DP), on the other hand, provides a formal framework for safeguarding data against potential privacy breaches. The application of these approaches in healthcare settings ensures the protection of sensitive patient information. In this paper, we delve into the challenges posed by medical data to see how FL and DP can be tailored to suit these requirements. We aim to strike a balance between technology deployment in the medical field and privacy preservation. To this end, we developed a Multi-layer Perceptron (MLP) model to predict whether a person is at risk of heart disease. The model, trained on different medical datasets for heart diseases, reached an accuracy of 99.57%. The same model was trained in an FL framework and achieved an FL averaged accuracy of 99.15%. In a third scenario, to enhance clients’ privacy, we deployed a DP framework. The differentially private MLP achieved an accuracy of 97.07% in centralized settings and an averaged accuracy of 89.94% in FL settings, outperforming existing methods in heart disease prediction.
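
A common way to make a model update differentially private is to clip its L2 norm and add Gaussian noise, as in DP-SGD-style training. The sketch below shows that step only. The abstract does not describe the DP mechanism actually used, so this choice, the function name, and the parameter values are assumptions, and no privacy accounting is included.

```python
import math
import random
from typing import List, Optional, Sequence


def privatize_update(
    update: Sequence[float],
    clip_norm: float,
    noise_multiplier: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Clip `update` to L2 norm `clip_norm`, then add Gaussian noise.

    The noise standard deviation is noise_multiplier * clip_norm.
    """
    rng = rng or random.Random()
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    sigma = noise_multiplier * clip_norm
    return [x * scale + rng.gauss(0.0, sigma) for x in update]


# Hypothetical client update; the seed is fixed only for repeatability.
noisy = privatize_update([3.0, 4.0], clip_norm=1.0, noise_multiplier=1.1,
                         rng=random.Random(0))
```

Clipping bounds how much any single update can influence the model, and the noise scale is tied to that bound. Larger noise gives stronger privacy at the cost of accuracy, which is consistent with the accuracy drop reported for the DP settings above.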
Download
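
As a rough illustration of how federated averaging and differential privacy can be combined, the sketch below clips each client's update and adds Gaussian noise to the average. The abstract for Paper 271 does not state which DP mechanism the authors used, so this is an assumed, commonly used recipe; the function names and parameters (`max_norm`, `noise_std`) are hypothetical, and calibrating the noise to a formal privacy budget is not shown.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale an update vector so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [u * scale for u in update]


def dp_federated_average(client_updates, max_norm, noise_std, rng=None):
    """Average clipped client updates, then add Gaussian noise per coordinate.

    Assumption: clip-and-noise aggregation, not necessarily the paper's mechanism.
    """
    rng = rng or random.Random(0)
    clipped = [clip_update(u, max_norm) for u in client_updates]
    n_clients = len(clipped)
    dim = len(clipped[0])
    averaged = [sum(u[i] for u in clipped) / n_clients for i in range(dim)]
    return [a + rng.gauss(0.0, noise_std) for a in averaged]


# Two hypothetical clients, each sending a two-parameter update.
updates = [[3.0, 4.0], [0.3, 0.4]]
print(dp_federated_average(updates, max_norm=1.0, noise_std=0.05))
```

Clipping bounds how much any single client can move the average, which is what lets added noise mask an individual contribution; a larger `noise_std` gives more privacy at the cost of accuracy, in line with the accuracy drop the abstract reports for the DP settings.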

Paper Nr: 275
Title:

How to Box Your Cells: An Introduction to Box Supervision for 2.5D Cell Instance Segmentation and a Study of Applications

Authors:

Fabian Schmeisser, Maria Caroprese, Gillian Lovell, Andreas Dengel and Sheraz Ahmed

Abstract: Cell segmentation in volumetric microscopic images is a fundamental step towards automating the analysis of life-like representations of complex specimens. As the performance of current Deep Learning algorithms is held back by the lack of accurately annotated ground truth, a pipeline is proposed that produces accurate 3D cell instance segmentation masks solely from slice-wise bounding box annotations. In an effort to further reduce the time requirements for the annotation process, a study is conducted on how to effectively reduce the size of the training set. To this end, three slice-reduction strategies are suggested and evaluated in combination with bounding box supervision. We find that as little as 1% of weakly labeled training data suffices to produce accurate results, and that predictions produced from a dataset 10 times smaller are of equal quality to those obtained when the full dataset is used for training.
Download

Paper Nr: 278
Title:

UIVLP: An Improved User Interface and Visualization Technique to Visualize Learners’ Performances

Authors:

Mukesh Kumar Rohil and Trishna Paul

Abstract: In most educational setups, the grading of students’ performance is based on their relative standing in the class. In this work, we develop and present a user interface to visualize students’ performance, expressed in terms of marks, out of the same maximum marks for each subject, scored by the students in the various evaluation components of a subject. First, we statistically select the three most informative subjects for the whole class and then find each individual student’s average score across all components, along with the overall class average for each evaluation component. We assume that the three courses whose performance is to be visualized in 3D are either specified by the evaluator or selected by the system on the basis of principal component analysis. The visualization procedures have been developed for both the individual student and the entire class. The interactive 3D visualization and the bar graphs can be compared side by side, and we visually observe that the scatter plot of clusters provides better insights than the conventional bar graphs. We also observe that the proposed visualization is better than the bar graphs on the basis of no-reference BRISQUE image quality assessment. However, there may be certain situations in which both types of graphs are needed.
Download

Paper Nr: 280
Title:

gAIus: Combining GenAI with Legal Clauses Retrieval for Knowledge-Based Assistant

Authors:

Michał Matak and Jarosław Chudziak

Abstract: In this paper we discuss the capability of large language models to ground their answers and provide proper references when dealing with legal matters of a non-English- and non-Chinese-speaking country. We discuss the history of legal information retrieval, the difference between case law and statute law and its impact on legal tasks, and we analyze the latest research in this field. Based on that background, we introduce gAIus, the architecture of a cognitive LLM-based agent whose responses are based on knowledge retrieved from a specific legal act, the Polish Civil Code. We propose a retrieval mechanism that is more explainable and human-friendly, and that achieves better results, than embedding-based approaches. To evaluate our method, we create a special dataset based on single-choice questions from entrance exams for law apprenticeships conducted in Poland. The proposed architecture effectively leveraged the abilities of the large language models used, improving gpt-3.5-turbo-0125 by 419%, allowing it to beat gpt-4o, and lifting the gpt-4o-mini score from 31% to 86%. At the end of the paper we outline a possible path for future research and potential applications of our findings.
Download

Paper Nr: 281
Title:

Meta-Ensemble Learning for Multi-Trait Optimization in Maize Breeding: Combining Gradient Boosting, Random Forests, and Deep Learning with SVM Integration

Authors:

Dupuy Rony Charles, Pascal Pultrini and Andrea Tettamanzi

Abstract: Plant breeding aims to enhance traits such as yield, drought tolerance, and disease resistance. Traditional Multi-Trait Selection Indices (MTSI) struggle with high-dimensional genomic data and complex trait interactions. We present a meta-ensemble machine learning framework integrating Gradient Boosting, Random Forest, and Deep Neural Networks (DNNs) with a Support Vector Machine (SVM) meta-model to address these challenges. This meta-ensemble approach leverages the strengths of multiple algorithms for improved predictive accuracy and robustness. Experiments on maize datasets show that our meta-ensemble significantly outperforms traditional MTSI methods and individual machine learning models. The meta-ensemble achieves superior predictive accuracy and operational efficiency, with a marked reduction in mean squared error (MSE) and consistent performance across validation sets. This study advances meta-ensemble machine learning in plant breeding, providing a robust framework for multi-trait selection. Our approach improves trait prediction reliability and sets a new standard in maize breeding, with potential applications in other crop species, enhancing agricultural productivity and resilience.
Download

Paper Nr: 286
Title:

Predicting the State of Health of Supercapacitors Using a Federated Learning Model with Homomorphic Encryption

Authors:

Víctor López, Oscar Fontenla-Romero, Elena Hernández-Pereira, Bertha Guijarro-Berdiñas, Carlos Blanco-Seijo and Samuel Fernández-Paz

Abstract: The increasing prevalence of supercapacitors (SCs) in various industrial sectors underscores the necessity for precise estimation of the state of health (SOH) of these devices. This article presents a novel approach to SOH prediction using a model that integrates federated learning (FL) and homomorphic encryption (HE), FedHEONN. Conventional SOH prediction models face challenges concerning accuracy, reliability, and secure data handling, particularly in Internet of Things (IoT) environments. FedHEONN addresses these issues by using FL to enable a network of distributed nodes to collaboratively develop a predictive model without the need to share private data. This model both enhances data privacy and leverages the collective intelligence of edge computing devices. Furthermore, the inclusion of HE allows computations to be performed on encrypted data, further securing the federated learning framework. We conducted experiments with a real dataset to evaluate the effectiveness of this FL method in predicting the SOH of SCs against conventional models, including linear regression with regularisation techniques such as Lasso, Ridge and Elastic-net, and non-linear models such as multilayer perceptron and support vector machine for regression. The results were tested in various configurations, including empirical mode decomposition (EMD) and multi-stage (MS) setups.
Download

Paper Nr: 294
Title:

Abnormal Predicates: Learning Categorical Defaults from Probabilistic Rules

Authors:

Rose Azad Khan and Vaishak Belle

Abstract: Learning defaults is a longstanding goal in the field of knowledge representation and reasoning. We provide a novel method for learning defaults by way of introducing a new predicate: the abnormal predicate, which explicitly covers all the exceptions to a rule, thus forming a default theory. Our proposed method for learning defaults is sound and complete for all rule-exceptions, and can be extended for use on other frameworks.
Download

Paper Nr: 296
Title:

Semantic Objective Functions: A Distribution-Aware Method for Adding Logical Constraints in Deep Learning

Authors:

Miguel Angel Mendez-Lucero, Enrique Bojorquez Gallardo and Vaishak Belle

Abstract: Issues of safety, explainability, and efficiency are of increasing concern in learning systems deployed with hard and soft constraints. Loss-function based techniques have shown promising results in this area, by embedding logical constraints during neural network training. Through an integration of logic and information geometry, we provide a construction and theoretical framework for these tasks that generalizes many approaches. We propose a loss-based method that embeds knowledge, that is, enforces logical constraints, into a machine learning model that outputs probability distributions. This is done by constructing a distribution from the logical formula, and constructing a loss function as a linear combination of the original loss function with the Fisher-Rao distance or Kullback-Leibler divergence to the constraint distribution. This construction is primarily for logical constraints in the form of propositional formulas (Boolean variables), but can be extended to formulas of a first-order language with finite variables over a model with compact domain (categorical and continuous variables), and to other statistical models that are to be trained with semantic information. We evaluate our method on a variety of learning tasks, including classification tasks with logic constraints, transferring knowledge from logic formulas, and knowledge distillation.
Download
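
The loss construction described for Paper 296 can be illustrated with a small sketch for the propositional case. It rests on assumptions the abstract does not confirm: the constraint distribution is taken to be uniform over the satisfying assignments, the divergence is computed as KL(constraint || model), and the names `constraint_distribution`, `constrained_loss`, and `weight` are hypothetical.

```python
import itertools
import math


def constraint_distribution(formula, n_vars):
    """Uniform distribution over the truth assignments that satisfy formula.

    Assumption: uniform mass on satisfying assignments, zero elsewhere.
    """
    assignments = list(itertools.product([False, True], repeat=n_vars))
    satisfying = {a for a in assignments if formula(*a)}
    mass = 1.0 / len(satisfying)
    return {a: (mass if a in satisfying else 0.0) for a in assignments}


def kl_divergence(q, p):
    """KL(q || p) on a shared finite support; terms with q = 0 contribute 0."""
    return sum(qv * math.log(qv / p[a]) for a, qv in q.items() if qv > 0.0)


def constrained_loss(base_loss, model_dist, formula, n_vars, weight):
    """Linear combination of the original loss and the divergence term."""
    q = constraint_distribution(formula, n_vars)
    return base_loss + weight * kl_divergence(q, model_dist)


def implies(a, b):
    """Example constraint: A implies B."""
    return (not a) or b


# A model that spreads its mass evenly over all four assignments of (A, B).
uniform_model = {a: 0.25 for a in itertools.product([False, True], repeat=2)}
print(constrained_loss(0.5, uniform_model, implies, n_vars=2, weight=1.0))
```

The penalty term is zero only when the model's distribution matches the constraint distribution, so increasing `weight` pushes probability mass away from assignments that violate the formula.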

Paper Nr: 300
Title:

Towards Multi-View Hand Pose Recognition Using a Fusion of Image Embeddings and Leap 2 Landmarks

Authors:

Sergio Esteban-Romero, Romeo Lanzino, Marco Raoul Marini and Manuel Gil-Martín

Abstract: This paper presents a novel approach for multi-view hand pose recognition through image embeddings and hand landmarks. The method integrates raw image data with structural hand landmarks derived from the Leap Motion Controller 2. A Vision Transformer (ViT) pretrained model was used to extract visual features from dual-view grayscale images, which were fused with the corresponding Leap 2 hand landmarks, creating a multimodal representation that encapsulates both visual and landmark data for each sample. These fused embeddings were then classified using a multi-layer perceptron to distinguish among 17 distinct hand poses from the Multi-view Leap2 Hand Pose Dataset, which includes data from 21 subjects. Using a Leave-One-Subject-Out Cross-Validation (LOSO-CV) strategy, we demonstrate that this fusion approach offers a robust recognition performance (F1 Score of 79.33 ± 0.09%), particularly in scenarios where hand occlusions or challenging angles may limit the utility of single-modality data.
Download
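
The evaluation protocol named in Paper 300, Leave-One-Subject-Out Cross-Validation, can be sketched as below. The split logic is the standard LOSO scheme; the `fuse` helper assumes simple concatenation of the two modalities, which the abstract does not specify, and all names are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def fuse(image_embedding, landmarks):
    """Concatenate both modalities into one feature vector (assumed fusion rule)."""
    return list(image_embedding) + list(landmarks)


# Toy example: 21 subjects with 3 samples each, mirroring the subject count only.
subjects = [s for s in range(21) for _ in range(3)]
for held_out, train, test in leave_one_subject_out(subjects):
    if held_out == 0:
        print(len(train), len(test))
```

Because every sample from the held-out subject is excluded from training, each fold measures how well the classifier generalises to a person it has never seen.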

Paper Nr: 310
Title:

Explainable AI: A Retrieval-Augmented Generation Based Framework for Model Interpretability

Authors:

Devansh Guttikonda, Deepika Indran, Lakshmi Narayanan, Tanishka Pasarad and Sandesh B. J.

Abstract: The growing reliance on machine learning and deep learning models in industries like healthcare, finance and manufacturing presents a major challenge: the lack of transparency and understanding of how these models make decisions. This paper introduces a novel Retrieval-Augmented Generation (RAG) based framework to tackle this issue. By leveraging Large Language Models (LLMs) and domain-specific knowledge bases, the proposed framework offers clear, interactive explanations of model outputs, making these systems more trustworthy and accessible for non-technical users. The framework’s effectiveness is demonstrated across healthcare, finance and manufacturing, offering a scalable and effective solution that can be applied across industries.
Download

Paper Nr: 312
Title:

Occupant Activity Recognition in IoT-Enabled Buildings: A Temporal HTN Planning Approach

Authors:

Ilche Georgievski

Abstract: Given that people spend most of their time indoors, it is imperative that buildings maintain optimal well-being for occupants. To achieve this, research must prioritise occupants over buildings themselves. IoT-enabled buildings can improve quality of life by understanding and responding to occupants’ behaviour. This requires recognising what occupants are doing based on IoT data, particularly by considering the objects they use in specific building areas. Situated within the realm of plan and goal recognition as planning, we propose a novel knowledge-engineering approach to occupant activity recognition leveraging temporal HTN planning. Our approach consists of two primary processes: generating problem instances from IoT data and engineering HTN domain models for activity recognition. The first ensures the representation of IoT data using planning constructs, while the second integrates knowledge about occupant activities into HTN domain models. To support our approach, we provide two HTN domain models tailored for workspaces and homes. Experimental validation with the latter domain and a real-world dataset shows that the quality of our computed solutions surpasses that of baseline data-driven approaches and is comparable to more advanced, hybrid approaches.
Download

Paper Nr: 315
Title:

Application of Large Language Models and ReAct Prompting in Policy Evidence Collection

Authors:

Yang Zhang and James Pope

Abstract: Policy analysis or formulation often requires evidence-based support to ensure the scientific rigor and rationality of the policy, increase public trust, and reduce risks and uncertainties. However, manually collecting policy-related evidence is a time-consuming and tedious process, making some automated collection methods necessary. This paper presents a novel approach for automating policy evidence collection through large language models (LLMs) combined with Reasoning and Acting (ReAct) prompting. The advantages of our approach lie in its minimal data requirements, while ReAct prompting enables the LLM to call external tools, such as search engines, ensuring real-time evidence collection. Since this is a novel problem without existing methods for comparison, we relied on human experts for ground truth and baseline comparison. In 50 experiments, our method successfully collected correct policy evidence 36 times using GPT-3.5. Furthermore, with more advanced models such as GPT-4o, the improved understanding of prompts and context enhances our method’s efficiency. Finally, our method using GPT-4o successfully gathered correct evidence 45 times in 50 experiments. Our results demonstrate that, using our method, policy researchers can effectively gather evidence to support policy-making.
Download

Paper Nr: 322
Title:

Environment Descriptions for Usability and Generalisation in Reinforcement Learning

Authors:

Dennis J. N. J. Soemers, Spyridon Samothrakis, Kurt Driessens and Mark H. M. Winands

Abstract: The majority of current reinforcement learning (RL) research involves training and deploying agents in environments that are implemented by engineers in general-purpose programming languages and more advanced frameworks such as CUDA or JAX. This makes the application of RL to novel problems of interest inaccessible to small organisations or private individuals with insufficient engineering expertise. This position paper argues that, to enable more widespread adoption of RL, it is important for the research community to shift focus towards methodologies where environments are described in user-friendly domain-specific or natural languages. Aside from improving the usability of RL, such language-based environment descriptions may also provide valuable context and boost the ability of trained agents to generalise to unseen environments within the set of all environments that can be described in any language of choice.
Download

Paper Nr: 324
Title:

An Evaluation of ChatGPT’s Reliability in Generating Biographical Text Outputs

Authors:

Kehinde Oloyede, Cristina Luca and Vitaliy Milke

Abstract: The rapid evolution of large language models has transformed the landscape of Artificial Intelligence-based applications, with ChatGPT standing out for generating text that feels human-like. This study aims to assess ChatGPT’s reliability and consistency when creating biographical texts. The paper focuses on evaluating how precise, consistent, readable and contextually appropriate the model’s biographical outputs are, taking into account various interactions and inputs. An input consisting of a biographical text dataset, specific rules and a prompt was used in extensive experimentation with ChatGPT. The model’s performance was assessed using both quantitative and qualitative measures, scrutinising how well it maintains consistency across different biographical scenarios. This paper shows how greater coherence and accuracy in text generation can be achieved by creating detailed and structured directives. The significance of this study extends beyond its technical aspects, as accurate and reliable biographical data is essential for record-keeping and historical preservation.
Download

Paper Nr: 325
Title:

Using LLM-Based Deep Reinforcement Learning Agents to Detect Bugs in Web Applications

Authors:

Yuki Sakai, Yasuyuki Tahara, Akihiko Ohsuga and Yuichi Sei

Abstract: This paper presents an approach to automate black-box GUI testing for web applications by integrating deep reinforcement learning (DRL) with large language models (LLMs). Traditional GUI testing is often inefficient and costly due to the difficulty in generating comprehensive test scenarios. While DRL has shown potential in automating exploratory testing by leveraging GUI interaction data, such data is browser-dependent and not always accessible in web applications. To address this challenge, we propose using LLMs to infer interaction information directly from HTML code, incorporating these inferences into the DRL’s state representation. We hypothesize that combining the inferential capabilities of LLMs with the robustness of DRL can match the accuracy of methods relying on direct data collection. Through experiments, we demonstrate that LLM-inferred interaction information effectively substitutes for direct data, enhancing both the efficiency and accuracy of automated GUI testing. Our results indicate that this approach not only streamlines GUI testing for web applications but also has broader implications for domains where direct state information is hard to obtain. The study suggests that integrating LLMs with DRL offers a promising path toward more efficient and scalable automation in GUI testing.
Download

Paper Nr: 326
Title:

Collision Avoidance and Return Manoeuvre Optimisation for Low-Thrust Satellites Using Reinforcement Learning

Authors:

Alexandru Solomon and Ciprian Paduraru

Abstract: Collision avoidance is an essential aspect of day-to-day satellite operations, enabling operators to carry out their missions safely despite the rapidly growing amount of space debris. This paper presents the capabilities of reinforcement learning (RL) approaches to train an agent capable of collision avoidance manoeuvres for low-thrust satellites in low-Earth orbit (LEO). The collision avoidance process performed by the agent consists of optimizing a collision avoidance manoeuvre as well as the return manoeuvre to the original orbit. The focus is on satellites with low-thrust propulsion systems, since optimizing a manoeuvre performed by such a system is more complex than for an impulsive system and is therefore a more interesting problem for RL methods. The training process is performed in a simulated environment of space conditions for a generic satellite in LEO subjected to a collision from different directions and with different velocities. This paper presents the results of agents trained with RL in training scenarios as well as in previously unknown situations using different methods such as DQN, REINFORCE, and PPO.
Download

Paper Nr: 329
Title:

An Empirical Study Using Machine Learning to Analyze the Relationship Between Musical Audio Features and Psychological Stress

Authors:

Harini Anand, Shalini Kammalam Srinivasan, Hasika Venkata Boggarapu, Arti Arya and Richa Sharma

Abstract: Music plays a vital role in regulating emotions and mental well-being, influencing brain function and stress levels. This study leverages Explainable AI (XAI) techniques, specifically SHapley Additive exPlanations (SHAP) and Integrated Gradients, to analyze the impact of scientifically backed audio features, such as Danceability, Energy, and Acousticness, on stress classification. Using a Feedforward Neural Network, we achieved an accuracy of 0.96 in categorizing music preferences into “Stressed,” “Not-stressed,” and “Borderline” states. The classifier operates effectively across languages and genres, enhancing its versatility for detecting Psychological Stress by providing interpretable insights.
Download

Paper Nr: 343
Title:

Bringing NL Back with P: Defending Linguistic Methods in NLP for Future AI Applications

Authors:

Fabio Meroni

Abstract: In the development of Artificial Intelligence (AI) tools for Natural Language Processing (NLP) applications, this position paper promotes the (re)introduction of rule-based, linguistically informed methodologies, with particular attention to addressing the challenges posed by low-resource languages and research ethics when it comes to the enhancement of machine intelligence by means of linguistic intelligence. NLP, as a rapidly evolving subfield of AI, has seen a proliferation of contributions in recent years. However, the predominant reliance on statistically driven approaches has reduced NLP to a pursuit of superficial aesthetic results, neglecting the foundational linguistic structures that underpin natural language processing. Consequently, the marginalization of linguists within the field has stalled progress toward a deeper understanding of Natural Language Formalization (NLF). Without targeted intervention, these issues threaten to persist, undermining the potential of NLP to achieve its full intellectual and practical promise. This paper argues for a renewed integration of the science of natural language (NL) into its processing (P) within an interdisciplinary framework that emphasizes collaboration between computational linguists and AI researchers, and presents a methodological proposition of a possible way to include linguistic resources in a richly informed AI application using NooJ.
Download

Paper Nr: 349
Title:

Combining Petri Nets and AI Techniques to Improve Dynamic Production Scheduling Optimization

Authors:

Salah Hammedi and Haythem Chniti

Abstract: This paper introduces an intelligent scheduling approach that integrates Petri nets and AI techniques to optimize real-time production in reconfigurable manufacturing systems (RMS) under uncertainty. Addressing key challenges such as resource allocation, downtime reduction, and dynamic adaptability, our method achieves an 85% success rate. By leveraging historical data, machine learning, and expert systems, it enhances throughput and minimizes idle time. Comparative analysis demonstrates that our approach outperforms existing static and dynamic methods, offering continuous adaptability to evolving conditions and superior resource allocation. These advancements establish a scalable framework for efficient and agile scheduling, setting a new standard for dynamic manufacturing environments.

Paper Nr: 354
Title:

Mitigating Algorithmic Bias in Prostate Cancer Risk Stratification with Responsible Artificial Intelligence and Machine Learning

Authors:

Meghana Kshirsagar, Mihir Sontakke, Gauri Vaidya, Ahmad Alkhan, Aideen Killeen and Conor Ryan

Abstract: Prostate cancer (PCa) is the second most prevalent cancer among men worldwide, with the majority of cases affecting those over the age of 65. The Gleason Score (GS) remains the gold standard for diagnosing clinically significant prostate cancer (csPCa); however, traditional biopsy can lead to patient discomfort. Algorithmic bias in medical diagnostic models remains a critical challenge, impacting model reliability and generalizability across diverse patient populations. This study explores the potential of Machine Learning (ML) models, namely Logistic Regression (LR) and multiple deep learning (DL) models, as non-invasive alternatives for predicting the GS using the Prostate Imaging Cancer AI challenge dataset. To the best of our knowledge, this is the first attempt to use two modalities with this dataset for risk stratification. We developed an LR model, excluding biopsy-derived features like GS, to predict clinically significant prostate cancer, alongside an image triage approach with convolutional neural networks to reduce biases in the ML workflow. Preliminary results from LR and ResNet50 showed test accuracies of 69.79% and 60%, respectively. These findings demonstrate the potential for explainable, trustworthy, and responsible risk stratification, enhancing the robustness and generalizability of the prostate cancer risk stratification model.
Download

Paper Nr: 356
Title:

Proposal of an Automated Testing Method for GraphQL APIs Using Reinforcement Learning

Authors:

Kenzaburo Saito, Yasuyuki Tahara, Akihiko Ohsuga and Yuichi Sei

Abstract: GraphQL is a new query language for APIs that has a different structure from the commonly used REST API, making it difficult to apply conventional automated testing methods. This necessitates new approaches. This study proposes GQL-QL, an automated testing method for GraphQL APIs using reinforcement learning. The proposed method uses Q-learning to explore the test space. It generates requests by selecting API fields and arguments based on the schema and updates Q-values according to the response. By repeating this process and learning from it, efficient black-box testing is achieved. Experiments were conducted on publicly available APIs to evaluate the effectiveness of the proposed method using schema coverage and error response rate as metrics. The results showed that the proposed method outperformed existing methods on both metrics.
Download
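
The loop described for Paper 356 (select a field or argument, send the request, update Q-values from the response) can be sketched with a generic tabular Q-learning update. The state and action encoding, the reward value, and the field names below are hypothetical; the abstract does not say how GQL-QL defines them.

```python
import random
from collections import defaultdict


def choose_action(q_table, state, actions, epsilon, rng):
    """Epsilon-greedy choice among candidate fields or arguments."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_table[(state, a)])


def q_update(q_table, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q-value."""
    best_next = max((q_table[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (target - q_table[(state, action)])
    return q_table[(state, action)]


q = defaultdict(float)
rng = random.Random(0)

# Hypothetical schema: the root type "Query" exposes the fields "posts" and "user".
action = choose_action(q, "Query", ["posts", "user"], epsilon=0.2, rng=rng)
# Assumed reward scheme: 1.0 when the response covers a new part of the schema.
q_update(q, "Query", action, reward=1.0, next_state="User", next_actions=["id", "name"])
print(action, q[("Query", action)])
```

With `epsilon` above zero the agent keeps trying unexplored fields occasionally, while the Q-values steer most requests towards selections that previously produced informative responses.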

Paper Nr: 357
Title:

Fish Catch Prediction by Combining Fishing, Weather and Tidal Data

Authors:

Tomohiro Tanaka, Yasuyuki Tahara, Akihiko Ohsuga and Yuichi Sei

Abstract: This study presents a model designed to predict days with increased probabilities of fish catches for inexperienced anglers by utilizing weather and tidal data. Specifically, the study pre-processed catch data, together with meteorological and tidal data from the Japan Meteorological Agency, to consider different fish species. The study applied feature engineering techniques, incorporating lag features and moving average features. Comparative evaluations were conducted against a baseline model that neither accounts for fish species nor includes lag and moving average features. The proposed method exhibited superior performance across all evaluation metrics compared to the baseline model. Specifically, the proposed method achieved a Root Mean Squared Error (RMSE) of 4.36 compared to the baseline's 5.47, a Mean Absolute Error (MAE) of 3.02 versus 4.16, an R² score of 0.20 compared to -0.27, a Mean Absolute Percentage Error (MAPE) of 74.6% versus 133.0%, and a Median Absolute Error (Median AE) of 2.04 compared to 3.33. These improvements highlight the effectiveness of the proposed model in enhancing predictive accuracy and reliability.
Download
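
The lag and moving-average features mentioned for Paper 357, together with two of the reported error metrics, can be sketched as follows. The window handling (using only past days) is an assumption, and the catch values are made up for illustration rather than taken from the study.

```python
import math


def lag_feature(series, lag):
    """Value observed `lag` steps earlier; None where no history exists."""
    if lag <= 0:
        raise ValueError("lag must be positive")
    return [None] * min(lag, len(series)) + list(series[:-lag])


def moving_average(series, window):
    """Mean of the previous `window` values, excluding the current step."""
    out = []
    for i in range(len(series)):
        if i < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(series[i - window:i]) / window)
    return out


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative daily catch counts (not data from the paper).
catches = [2, 4, 6, 8]
print(lag_feature(catches, 1))
print(moving_average(catches, 2))
```

Excluding the current day from the moving average keeps the feature free of the value being predicted, which matters when the same series is both input and target.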

Paper Nr: 358
Title:

Vision-Language Models for E-commerce: Detecting Non-Compliant Product Images in Online Catalogs

Authors:

Maciej Niemir, Dominika Grajewska and Bartłomiej Nitoń

Abstract: This study explores the use of vision-language models (VLMs) for automated validation of product images in e-commerce, aiming to ensure visual consistency and accuracy without the need for extensive data annotation and specialized training. We evaluated two VLMs, LLaVA and Moondream2, to determine their effectiveness in classifying images based on suitability for online display, focusing on aspects such as visibility and representational clarity. Each model was tested with varying textual prompts to assess the impact of query phrasing on predictive accuracy. Moondream2 outperformed LLaVA in both precision and processing speed, making it a more practical solution for large-scale e-commerce applications. Its high specificity and negative predictive value (NPV) highlight its effectiveness in identifying non-compliant images. Our results suggest that VLMs like Moondream2 provide a viable approach to visual validation in e-commerce, offering benefits in scalability and implementation efficiency, particularly where a rapid and reliable assessment of product imagery is critical. This research demonstrates the potential of VLMs as effective alternatives to traditional image validation methods, underscoring their role in enhancing the quality of the digital catalog.
Download
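
For readers less familiar with the two metrics highlighted for Paper 358, the sketch below computes specificity and negative predictive value from binary labels. It assumes that "compliant" is the positive class, so that non-compliant images are the negatives; the abstract does not state this explicitly, and the labels are illustrative only.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for boolean labels where True is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp, fp, tn, fn


def specificity(tn, fp):
    """Share of actual negatives that were predicted negative."""
    return tn / (tn + fp)


def negative_predictive_value(tn, fn):
    """Share of negative predictions that are actually negative."""
    return tn / (tn + fn)


# Illustrative labels only (True = compliant image), not data from the study.
y_true = [True, True, True, False, False, False, False, True]
y_pred = [True, True, False, False, False, False, True, True]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(specificity(tn, fp), negative_predictive_value(tn, fn))
```

Under this labelling, high specificity means few non-compliant images slip through as compliant, and high NPV means an image flagged as non-compliant usually is.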

Paper Nr: 370
Title:

Stacked Ensemble Deep Learning for Robust Intrusion Detection in IoT Networks

Authors:

Marwa Amara, Nadia Smairi and Mohamed Jaballah

Abstract: Intrusion Detection Systems (IDS) are critical for addressing the growing complexity of cyber threats in the Internet of Things (IoT) domain. This paper introduces a novel stacked ensemble approach combining Convolutional Neural Networks (CNN), Temporal Convolutional Networks (TCN), and Long Short-Term Memory (LSTM) models through a logistic regression meta-model. The proposed approach leverages the distinct strengths of each classifier (sequential pattern recognition by LSTMs, temporal dependency modeling by TCNs, and spatial feature extraction by CNNs) to create a robust and reliable detection framework. To address the class imbalance problem, we applied various balancing techniques, including Oversampling, Undersampling, and a hybrid Meet-in-the-Middle method. The effectiveness of the approach is demonstrated on the CICIDS2017 dataset, achieving an accuracy of 99.99% and an F1-score of 100% with Oversampling, and 99.93% accuracy with the Meet-in-the-Middle technique.
Download
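
The three balancing strategies named for Paper 370 can be contrasted in a short sketch. Random over- and undersampling are standard; the "middle" strategy here resamples every class to the midpoint between the smallest and largest class sizes, which is only an assumed reading of "Meet-in-the-Middle", since the abstract does not define it. All function and strategy names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def resample_to(labels, target_per_class, rng):
    """Return sample indices so that every class has exactly target_per_class items."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    chosen = []
    for idx in by_class.values():
        if len(idx) >= target_per_class:
            # Undersample without replacement.
            chosen.extend(rng.sample(idx, target_per_class))
        else:
            # Oversample by drawing duplicates with replacement.
            extra = [rng.choice(idx) for _ in range(target_per_class - len(idx))]
            chosen.extend(idx + extra)
    return chosen


def balance(labels, strategy, seed=0):
    """strategy is 'over', 'under', or 'middle' (assumed midpoint rule)."""
    counts = Counter(labels)
    largest, smallest = max(counts.values()), min(counts.values())
    targets = {"over": largest, "under": smallest, "middle": (largest + smallest) // 2}
    return resample_to(labels, targets[strategy], random.Random(seed))


# Illustrative imbalance: 90 benign flows and 10 attack flows.
labels = ["benign"] * 90 + ["attack"] * 10
for strategy in ("over", "under", "middle"):
    idx = balance(labels, strategy)
    print(strategy, Counter(labels[i] for i in idx))
```

A midpoint target discards fewer majority samples than pure undersampling and duplicates fewer minority samples than pure oversampling, which is the usual motivation for hybrid schemes.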

Paper Nr: 372
Title:

Neural Architecture Search: Tradeoff Between Performance and Efficiency

Authors:

Tien Dung Nguyen, Nassim Mokhtari and Alexis Nédélec

Abstract: Many Neural Architecture Search (NAS) methods have designed models that outperform manually configured networks on various tasks. Because training models is computationally costly, a recent trend is to perform NAS without training candidate networks in the process. Many such methods have shown that training-free metrics are an effective way to assess a model’s performance, especially when several are combined. NAS methods with multiple training-free objectives usually construct a Pareto front that gives a wide range of solutions. However, only one solution is chosen in the end. We introduce the Rank-based Improved Firefly Algorithm (RB-IFA), which focuses the search in a single direction by converting multiple objective ranks into one weighted sum. Weights are derived from a performance-efficiency tradeoff. Our search algorithm is based on an Improved Firefly Algorithm (IFA). IFA effectively explores the NAS landscape by combining the Firefly Algorithm, which has fast convergence, with a genetic algorithm, which improves the ability to overcome local optima. RB-IFA NAS identifies highly efficient architectures with competitive performance within 8 minutes. These results highlight the potential of multi-training-free metrics and a rank-based approach in finding efficient neural networks.
Download
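
The idea in Paper 372 of collapsing several objective ranks into one weighted sum can be illustrated as below. This is a minimal sketch with two objectives and a single `tradeoff` weight; the actual training-free metrics, the way RB-IFA derives its weights, and the candidate values shown are not given in the abstract and are assumptions.

```python
def ranks(values, higher_is_better=True):
    """Rank 0 is best; ties are broken by original order."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=higher_is_better)
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank
    return result


def weighted_rank_score(performance, cost, tradeoff):
    """Lower is better. tradeoff in [0, 1] weights performance rank against cost rank."""
    perf_ranks = ranks(performance, higher_is_better=True)
    cost_ranks = ranks(cost, higher_is_better=False)
    return [tradeoff * p + (1.0 - tradeoff) * c for p, c in zip(perf_ranks, cost_ranks)]


# Hypothetical candidates: a training-free performance proxy and a cost measure.
performance = [0.9, 0.7, 0.8]
cost = [300, 100, 200]
scores = weighted_rank_score(performance, cost, tradeoff=0.7)
best = min(range(len(scores)), key=lambda i: scores[i])
print(scores, best)
```

Working on ranks rather than raw values makes objectives with different units comparable, and a single scalar score gives the search one direction to follow instead of a whole Pareto front.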

Paper Nr: 373
Title:

Dialogue Support Through the Identification of Utterances Crucial for the Listener's Interpretation if Missed

Authors:

Kenshin Nakanishi, Tomoyuki Maekawa and Michita Imai

Abstract: When a listener becomes distracted and misses an important utterance, it can hinder their understanding of the conversation and their subsequent responses. In this study, we developed a chat system that simulates the impact of missed important utterances using an algorithm for identifying contextually significant dialogue that we have studied in previous work. The system assesses whether each user utterance contains important context and, if so, alerts the user to the possibility of misunderstanding by the other party. The results showed that when important utterances were missed, the listener often misunderstood the flow of the conversation. However, the effectiveness of the assistance that alerts users to potential misunderstandings varied depending on the case, and it became clear that the benefits of this feature in a chat system are limited.
Download

Paper Nr: 378
Title:

Leveraging Deep Q-Network Agents with Dynamic Routing Mechanisms in Convolutional Neural Networks for Enhanced and Reliable Classification of Alzheimer’s Disease from MRI Scans

Authors:

Jolanta Podolszanska

Abstract: With limited data and complex image structures, accurate classification of medical images remains a significant challenge in AI-assisted diagnostics. This study presents a hybrid CNN model with a capsule network layer and dynamic routing mechanism, enhanced with a Deep Q-network (DQN) agent, for MRI image classification in Alzheimer’s disease detection. The approach combines a capsule network that captures complex spatial patterns with dynamic routing, improving model adaptability. The DQN agent manages the weights and optimizes learning by interacting with the evolving environment. Experiments conducted on popular MRI datasets show that the model outperforms traditional methods, significantly improving classification accuracy and reducing misclassification rates. These results suggest that the approach has great potential for clinical applications, contributing to the accuracy and reliability of automated diagnostic systems.
Download

Paper Nr: 390
Title:

Synthetic Data Generation for Emergency Medical Systems: A Systematic Comparison of Tabular GAN Extensions

Authors:

Md Faisal Kabir, Md Majharul Islam Nayem and Sven Tomforde

Abstract: The generation of synthetic medical data has gained significant attention due to privacy concerns and the limited availability of real medical datasets. Various methods and techniques have been employed across domains to address these challenges, especially for tabular data. This study presents a comparative analysis of multiple generative models and privacy concerns. In addition, we propose the WLSTM-GAN model, which is evaluated with and without privacy constraints specifically for three medical tabular datasets. Our model is designed to handle both categorical and continuous features independently, incorporating a single generator with two specialized LSTM networks, as well as two distinct discriminators tailored for continuous and categorical data. We demonstrate that LSTM-based architectures can be effectively adapted for tabular data generation, with our WLSTM-GAN outperforming several existing models in fidelity and privacy preservation.
Download

Paper Nr: 391
Title:

Morphological Disambiguation of Texts Based on Analogical Proportions

Authors:

Bilel Elayeb, Myriam Bounhas and Mohamed Firas Ettih

Abstract: The Arabic language is known for its complexity, which encompasses extensive morphological and orthographic variations, as well as significant syntactic and semantic diversity. These unique characteristics often result in morphological ambiguity in Arabic. In this paper, we tackle the challenge of morphological disambiguation in Arabic texts. We frame this task as a classification problem, where the possible values of morphological features represent the classes, and a classification algorithm is used to assign the appropriate class to each word based on its context. Specifically, we investigate the effectiveness of an analogy-based classifier for morphological disambiguation in Arabic texts. Analogical Proportions (AP) are statements that express the relationship between four elements A, B, C, and D such that "A differs from B as C differs from D". Leveraging Analogical Proportions-based inference, the AP classifier predicts the fourth, unknown element (D), given that the first three (A, B, and C) are known. We evaluate this analogical classifier using a corpus of Classical Arabic texts. The average disambiguation rate (74.80%) of the AP classifier outperforms that of a set of well-established machine-learning and deep learning-based classifiers.
Download
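
The inference step described in the abstract above (predicting D from known A, B, and C) can be sketched for Boolean features. This is a minimal illustration using the standard Boolean definition of analogical proportion, not the authors' classifier: the toy training set, the labels, and the simple vote over all ordered triples are assumptions made for the example.

```python
from collections import Counter
from itertools import permutations


def is_proportion(a, b, c, d):
    """True if 'a differs from b as c differs from d' for Boolean values."""
    return ((a and not b) == (c and not d)) and ((not a and b) == (not c and d))


def solve(a, b, c):
    """Return d such that a:b::c:d holds, or None if no solution exists."""
    if a == b:
        return c
    if a == c:
        return b
    return None


def classify(training, query):
    """Vote over all ordered triples whose features are in proportion with the query."""
    votes = Counter()
    for (xa, ya), (xb, yb), (xc, yc) in permutations(training, 3):
        if all(is_proportion(a, b, c, d)
               for a, b, c, d in zip(xa, xb, xc, query)):
            label = solve(ya, yb, yc)  # solve the same equation on the class labels
            if label is not None:
                votes[label] += 1
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy data: two Boolean features per item, one label each.
training = [
    ((False, False), "neg"),
    ((False, True), "neg"),
    ((True, False), "pos"),
]
print(classify(training, (True, True)))
```

The equation a:b::c:d has a solution only when a equals b or a equals c, which is why some triples contribute no vote.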

Paper Nr: 407
Title:

Enhanced YOLOv8 Framework for Early Detection of Alzheimer's Disease Using MRI Scans

Authors:

Safa Jraba, Mohamed Elleuch, Hela Ltifi and Monji Kherallah

Abstract: Alzheimer's disease is a progressive neurodegenerative disorder that is often diagnosed too late because its early symptoms are hidden. Early detection is crucial for effective treatment and for slowing the progression of the disease. We propose an enhanced version of the YOLOv8 (You Only Look Once) framework for detecting Alzheimer's disease from MRI scans. Our approach targets early structural changes in the brain, particularly in the hippocampus and cortex, which are among the first areas affected by the disease. The framework performs state-of-the-art detection of Alzheimer's changes with 96% precision via multi-scale feature extraction specifically designed for neuroimaging data. Results show that this approach is exceptionally effective in improving sensitivity and precision over existing techniques, marking it as a highly reliable method for early diagnosis of Alzheimer's disease.
Download

Paper Nr: 408
Title:

Generating Safe Policies for Multi-Agent Path Finding with Temporal Uncertainty

Authors:

Jiří Švancara, David Zahrádka, Mrinalini Subramanian, Roman Barták and Miroslav Kulich

Abstract: Multi-Agent Path Finding (MAPF) deals with the problem of finding collision-free paths for a group of mobile agents moving in a shared environment. In practice, the duration of individual move actions may not be exact but may instead fall within a given range. This extension of the MAPF problem is called MAPF with Temporal Uncertainty (MAPF-TU). In this paper, we propose a compilation-based approach to generate safe policies for agents that solve the MAPF-TU problem. The policy guarantees that each agent reaches its destination without colliding with other agents, provided that all agents move within their predefined temporal uncertainty range. We show both theoretically and empirically that using policies rather than plans is guaranteed to solve more types of instances and find a better solution.
Download

Paper Nr: 409
Title:

Towards Trustworthy AI in Demand Planning: Defining Explainability for Supply Chain Management

Authors:

Ruiqi Zhu, Cecilie Christensen, Bahram Zarrin, Per Bækgaard and Tommy Sonne Alstrøm

Abstract: Artificial intelligence is increasingly essential in supply chain management, where machine learning models improve demand forecasting accuracy. However, as AI usage expands, so do the complexity and opacity of predictive models. Given the significant impact on operations, it is crucial for demand planners to trust these forecasts and the decisions derived from them, highlighting the need for explainability. This paper reviews prominent definitions of explainability in AI and proposes a tailored definition of explainability for supply chain management. By using a user-centric approach, we address the practical needs of definitions of explainability for non-technical users. This domain-specific definition aims to support the future development of interpretable AI models that enhance user trust and usability in demand planning tools.
Download

Paper Nr: 420
Title:

Applying Informer for Option Pricing: A Transformer-Based Approach

Authors:

Feliks Bańka and Jarosław A. Chudziak

Abstract: Accurate option pricing is essential for effective trading and risk management in financial markets, yet it remains challenging due to market volatility and the limitations of traditional models like Black-Scholes. In this paper, we investigate the application of the Informer neural network for option pricing, leveraging its ability to capture long-term dependencies and dynamically adjust to market fluctuations. This research contributes to the field of financial forecasting by introducing Informer’s efficient architecture to enhance prediction accuracy and provide a more adaptable and resilient framework compared to existing methods. Our results demonstrate that Informer outperforms traditional approaches in option pricing, advancing the capabilities of data-driven financial forecasting in this domain.
Download

Paper Nr: 426
Title:

Leveraging Capsule Networks for Robust Brain Tumor Classification and Detection in MRI Scans

Authors:

Sandeep Shiraskar, Simon Vellandurai and Dominick Rizk

Abstract: Brain tumors are life-threatening conditions where early detection and accurate classification are critical for timely and effective treatment. Misclassification or delayed identification of tumors can result in fatal consequences. Current deep learning techniques, predominantly based on Convolutional Neural Networks (CNNs), have demonstrated success in tumor detection but face limitations due to their inability to handle diverse and extensive datasets effectively. Moreover, CNNs suffer from information loss in pooling layers, leading to suboptimal performance in capturing global dependencies in MRI tumor images. To overcome these challenges, we propose the use of a modified Capsule Network to address the limitations of CNNs. Capsule Networks retain spatial hierarchies and dependencies, enabling improved performance in tumor detection and classification tasks. Our approach achieves near-perfect classification accuracy across four classes—pituitary, glioma, meningioma, and no tumor—using a diverse and augmented dataset. The dataset comprises publicly available MRI images from Figshare, Sartaj, and Br35 collections, providing a robust platform for evaluating model performance. Experimental results demonstrate that our method not only achieves superior accuracy compared to existing techniques but also maintains its performance across a broader range of data. These findings highlight the potential of Capsule Networks as a reliable and effective solution for brain tumor classification tasks, paving the way for advancements in medical imaging and diagnostic technologies.
Download

Paper Nr: 442
Title:

SPEAR: SPADE-Net and HuBERT for Enhanced Audio-to-Facial Reconstruction

Authors:

Xuan-Nam Cao and Minh-Triet Tran

Abstract: Generating talking faces has become an essential area of research due to its broad applications. Previous studies in facial synthesis have faced challenges in maintaining consistency between input landmarks and generated facial images, especially when dealing with complex expressions or pose variations. To address these challenges, this paper proposes a novel generative approach for face synthesis driven by audio, pose, and reference images. The proposed system combines a pretrained Variational Autoencoder (VAE), Transformer encoders, SPADE (Spatially Adaptive Normalization) modules, and optical flow-based warping to generate realistic facial images. The system utilizes HuBERT for audio feature extraction, a pose encoder for capturing pose-driven features, and a reference encoder to provide contextual facial information. The generated face, incorporating audio cues, pose variations, and reference images, is refined through optical flow to align with the driven pose and landmarks, ensuring high fidelity and natural facial animation. Experimental results demonstrate the effectiveness of this system in generating high-quality, emotion-driven facial animations.
Download

Paper Nr: 444
Title:

Image2Life: A Model for 3D Mesh Reconstruction from a Single-Image

Authors:

Lynda Ayachi and Mohamed Rabia Benarbia

Abstract: Reconstructing 3D models from a single 2D image is a complex yet fascinating challenge with applications in areas like computer vision, robotics, and augmented reality. In this work, we propose a novel approach to tackle this problem, focusing on creating accurate and detailed 3D representations from minimal input. Our model combines advanced deep learning techniques with geometry-aware methods to extract and translate meaningful features from 2D images into 3D shapes. By introducing a new framework for feature extraction and a carefully designed decoding architecture, our method captures intricate details and improves the overall reconstruction quality. We tested the model extensively on well-known datasets, and the results show significant improvements compared to existing methods in terms of accuracy and reliability.
Download

Paper Nr: 450
Title:

Detecting Misleading Information with LLMs and Explainable ASP

Authors:

Quang-Anh Nguyen, Thu-Trang Pham, Thi-Hai-yen Vuong, Van-Giang Trinh and Nguyen Ha Thanh

Abstract: Answer Set Programming (ASP) is traditionally constrained by predefined rule sets and domains, which limits the scalability of ASP systems. While Large Language Models (LLMs) exhibit remarkable capabilities in linguistic comprehension and information representation, they are limited in logical reasoning, which is a notable strength of ASP. Hence, there is growing research interest in integrating LLMs with ASP to leverage these abilities. Although many models combining LLMs and ASP have demonstrated competitive results, issues related to misleading input information, which directly lead to incorrect solutions from these models, have not been adequately addressed. In this study, we propose a method integrating LLMs with explainable ASP to trace back and identify misleading segments in the provided input. Experiments conducted on the CLUTRR dataset show promising results, laying a foundation for future research on error correction to enhance the accuracy of question-answering models. Furthermore, we discuss current challenges, potential advancements, and issues related to the utilization of hybrid AI systems.
Download

Paper Nr: 461
Title:

Robust Skin Lesion Segmentation Approach Combining YOLOv8 and Level-Set Techniques

Authors:

Mariem Jendoubi, Dorsaf Hmida, Mohamed Amine Mezghich and Slim Mhiri

Abstract: Accurate skin lesion segmentation is vital for early diagnosis and treatment in dermatology. While traditional active contour models like Chan-Vese handle noise and poor edge contrast well, they often struggle with anatomical inhomogeneity. This paper introduces a hybrid segmentation approach that combines YOLOv8 with the Chan-Vese model and integrates the weight matrix of Fuzzy C-Means (FCM) to guide the evolution of level set contours. YOLOv8 provides initial segmentation masks using its object detection capabilities, which are refined by the Chan-Vese model for precise boundary delineation. This method integrates the strengths of deep learning for initialization and mathematical modeling for refinement. Experiments on ISIC 2016, ISIC 2017 and ISIC 2018 datasets validate the effectiveness of the proposed approach, achieving high accuracy, robustness, and computational efficiency.

Paper Nr: 465
Title:

Towards AI-Enabled Model-Driven Architecture: Systematic Literature Review

Authors:

Zina Zammel, Mouna Rekik, Lotfi Souifi and Ismael Bouassida Rodriguez

Abstract: The convergence of two separate areas of computer science, Model Driven Architecture (MDA) and Artificial Intelligence (AI), can lead to collaboration in two main ways: AI-driven MDA and MDA for AI. In this paper, we present a Systematic Literature Review (SLR) on the application of AI within MDA. Additionally, we examine how AI facilitates transformations between the Computation Independent Model (CIM), Platform Independent Model (PIM), and Platform Specific Model (PSM), highlighting methods that bridge conceptual models with technical specifications. This review contributes to a deeper understanding of AI’s role in enhancing the effectiveness of MDA frameworks by analyzing existing studies that are selected using SLR. Based on a systematic search of IEEE, ScienceDirect, Springer, ACM, and Google Scholar, relevant articles published between 2018 and 2024 were identified. The adoption of AI introduces numerous benefits to software engineering, including enhanced support for designers and automation in model transformations.
Download

Paper Nr: 477
Title:

Communication and Negotiation to Improve Agent-Based Models

Authors:

Alejandro Rodríguez-Arias, Noelia Sánchez-Maroño and Bertha Guijarro-Berdiñas

Abstract: Agent-based models (ABM) play a fundamental role in studying and modeling complex real-world systems, primarily relying on reactive agents. Despite their simplicity, the interactions between agents and their environment enable the simulation of diverse systems, contributing to their widespread adoption, particularly in the social sciences. Similarly, though distinct in purpose, multi-agent systems (MAS) are designed to tackle complex, diverse, and distributed problems by leveraging communication, negotiation, and coordination capabilities. Both types of approaches have been used successfully in numerous areas; the power of ABM lies in thousands of interacting agents, while MAS usually employs a smaller number of agents with more capabilities. Including MAS agents’ capabilities in ABM agents allows the generation of more realistic simulations that aid in the study of the modeled systems. In this paper, we present a generic ABM model whose agents possess more capabilities, such as communication and negotiation, allowing this enhanced ABM to address more complex modeling problems. To exemplify the usefulness of this enhanced ABM, we propose to use it as a sandbox-tool to test “case-if” scenarios in a model that studies the evolution of a society’s opinion on a given subject, specifically in this example, the implantation of superblocks in the city of Vitoria-Gasteiz (Spain).
Download

Paper Nr: 478
Title:

Effectiveness of Whisper’s Fine-Tuning for Domain-Specific Use Cases in the Industry

Authors:

Daniel Pawlowicz, Jule Weber and Claudia Dukino

Abstract: The integration of Speech-to-Text (STT) technology has the potential to enhance the efficiency of industrial workflows. However, standard speech models demonstrate suboptimal performance in domain-specific use cases. In order to gain user trust, it is essential to ensure accurate transcription, which can be achieved through the fine-tuning of the model to the specific domain. OpenAI’s Whisper was selected as the initial model and subsequently fine-tuned with domain-specific real-world recordings. The fine-tuned model outperforms the initial model in terms of transcription of technical jargon, as evidenced by the results of the study. The fine-tuned model achieved a validation loss of 1.75 and a Word Error Rate (WER) of 1. In addition to improving accuracy, this approach addresses the challenges of noisy environments and speaker variability that are common in real-world industrial environments. The present study demonstrates the efficacy of fine-tuning the Whisper model to new vocabulary with technical jargon, thereby underscoring the value of model adaptation for domain-specific use cases.
Download
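
For readers unfamiliar with the Word Error Rate (WER) reported in the abstract above, the following minimal sketch computes it as the word-level edit distance divided by the number of reference words. The example sentences are invented, and the study may have used a different tool or text normalization.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Invented example: one misrecognized technical term out of six words.
print(word_error_rate("tighten the flange bolt to spec",
                      "tighten the flange bold to spec"))
```

Because WER is a ratio, it can be reported either as a fraction or as a percentage, so a bare number should be read together with the unit the authors intended.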

Paper Nr: 484
Title:

Towards a Meaningful Communication and Model Aggregation in Federated Learning via Genetic Programming

Authors:

Elia Pacioni, Francisco Fernández De Vega and Davide Calvaresi

Abstract: Federated Learning (FL) enables collaborative training of machine learning models while preserving client data privacy. However, its conventional client-server paradigm presents two key challenges: (i) communication efficiency and (ii) model aggregation optimization. Inefficient communication, often caused by transmitting low-impact updates, results in unnecessary overhead, particularly in bandwidth-constrained environments such as wireless or mobile networks or in scenarios with numerous clients. Furthermore, traditional aggregation strategies lack the adaptability required for stable convergence and optimal performance. This paper emphasizes the distributed nature of FL clients (agents) and advocates for local, autonomous, and intelligent strategies to evaluate the significance of their updates—such as using a “distance” metric relative to the global model. This approach improves communication efficiency by prioritizing impactful updates. Additionally, the paper proposes an adaptive aggregation method leveraging genetic programming and transfer learning to dynamically evolve aggregation equations, optimizing the convergence process. By integrating insights from multi-agent systems, the proposed approach aims to foster more efficient and robust frameworks for decentralized learning.
Download
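
The idea in the abstract above of letting each client judge the significance of its own update with a "distance" to the global model can be sketched as follows. This is only an illustration under assumptions: the Euclidean distance, the flat weight vectors, and the threshold value are choices made for the example, not details taken from the paper.

```python
import math


def l2_distance(local_weights, global_weights):
    """Euclidean distance between two flat weight vectors of equal length."""
    return math.sqrt(sum((l - g) ** 2
                         for l, g in zip(local_weights, global_weights)))


def should_send_update(local_weights, global_weights, threshold):
    """Transmit only when the local model has moved far enough from the global one."""
    return l2_distance(local_weights, global_weights) >= threshold


# Hypothetical weights for one global model and two clients.
global_model = [0.5, -0.2, 0.1]
client_near = [0.51, -0.2, 0.1]  # barely changed, likely a low-impact update
client_far = [0.9, 0.1, 0.1]     # changed substantially

THRESHOLD = 0.1  # assumed value; in practice it would need tuning
for name, weights in [("near", client_near), ("far", client_far)]:
    print(name, should_send_update(weights, global_model, THRESHOLD))
```

Skipping updates below the threshold trades a small amount of information for reduced traffic, which matters most in the bandwidth-constrained settings the abstract mentions.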

Paper Nr: 493
Title:

Limitations of Tokenizers for Building a Neuro-Symbolic Lexicon

Authors:

Hilton Alers-Valentín, José D. Maldonado-Torres and J. Fernando Vega-Riveros

Abstract: Tokenization is a critical preprocessing step in natural language processing (NLP), as it determines the units of text that will be analyzed. Conventional tokenization strategies, such as whitespace-based or frequency-based methods, often fail to preserve linguistically meaningful units, including multi-word expressions, phrasal verbs, and morphologically complex tokens. Such failures result in downstream processing errors and hinder parsing performance. This paper examines contemporary tokenization approaches and their limitations in light of foundational concepts in morphology that are relevant for natural language parsing. We then proceed to describe the required features for the cognitive modeling of a human language lexicon and introduce a linguistically aware encoding pipeline. Finally, a preliminary assessment of this system will be presented and major points of the proposed system will be summarized in the conclusions.
Download

Paper Nr: 501
Title:

Making Reinforcement Learning Safer via Curriculum Learning

Authors:

Kelvin Toonen and Thiago D. Simão

Abstract: The growth of deep reinforcement learning gives rise to safety concerns about applications using reinforcement learning. Therefore, it is crucial to investigate the safety aspect in this field, especially in the domain of robotics where agents can break surrounding objects or themselves. Curriculum learning has the potential to help with creating safer agents, because it helps the agent with learning faster and it allows for the agent to learn in safer and more controlled environments leading up to the target environment. More specifically, we change the environment only slightly to make it easier to transfer knowledge from one environment to the next, while still influencing the exploration process of the agent. This project combines curriculum learning with constrained reinforcement learning, a specific form of incorporating safety, to create a framework that allows agents to learn safely, even during training. This framework is also extended to include automation of the curriculum.
Download

Paper Nr: 503
Title:

Lexical Substitution is not Synonym Substitution: On the Importance of Producing Contextually Relevant Word Substitutes

Authors:

Juraj Vladika, Stephen Meisenbacher and Florian Matthes

Abstract: Lexical Substitution is the task of replacing a single word in a sentence with a similar one. This should ideally be one that is not necessarily only synonymous, but also fits well into the surrounding context of the target word, while preserving the sentence’s grammatical structure. Recent advances in Lexical Substitution have leveraged the masked token prediction task of Pre-trained Language Models to generate replacements for a given word in a sentence. With this technique, we introduce CONCAT, a simple augmented approach which utilizes the original sentence to bolster contextual information sent to the model. Compared to existing approaches, it proves to be very effective in guiding the model to make contextually relevant predictions for the target word. Our study includes a quantitative evaluation, measured via sentence similarity and task performance. In addition, we conduct a qualitative human analysis to validate that users prefer the substitutions proposed by our method, as opposed to previous methods. Finally, we test our approach on the prevailing benchmark for Lexical Substitution, CoInCo, revealing potential pitfalls of the benchmark. These insights serve as the foundation for a critical discussion on the way in which Lexical Substitution is evaluated.
Download

Paper Nr: 12
Title:

Intelligent Human Iris Recognition System Based on Deep Learning Models

Authors:

Andreea Negoiţescu

Abstract: This research paper presents the development of an intelligent biometric system which performs human iris recognition. The software application that incorporates it is called KEYE. Deep learning models are implemented to segment and recognize the users’ irises at authentication. Iris segmentation uses a modified version of the U-Net convolutional neural network, trained and validated on images from the I-SOCIAL-DB dataset. The experimental results show a maximum validation accuracy of 98.98% and a Dice score of 0.93. The extraction of features from the segmented images is done using part of the layers of the pre-trained DenseNet-201 neural network. For classification, the KEYE-DB dataset with visible light spectrum images was created. The accuracy obtained after testing the recognition model is 99.98%. In the conducted experiments, the precision, specificity, recall and F1 score exceed 0.9955, while the error and the false positive rate are almost zero. The performance of the biometric system has proven to be satisfactory.
Download
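
The Dice score reported for segmentation in the abstract above measures the overlap between a predicted mask and a ground-truth mask. A minimal sketch on binary masks follows. The tiny masks are invented, and the convention of returning 1.0 for two empty masks is an assumption.

```python
def dice_score(predicted, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for binary masks given as lists of 0/1 rows."""
    intersection = sum(p * t
                       for p_row, t_row in zip(predicted, target)
                       for p, t in zip(p_row, t_row))
    total = sum(map(sum, predicted)) + sum(map(sum, target))
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Invented 2x3 masks: 1 marks an iris pixel, 0 marks background.
predicted_mask = [[0, 1, 1],
                  [0, 1, 0]]
target_mask = [[0, 1, 0],
               [0, 1, 1]]
print(dice_score(predicted_mask, target_mask))
```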

Paper Nr: 16
Title:

Hand Gesture Recognition Using MediaPipe Landmarks and Deep Learning Networks

Authors:

Manuel Gil-Martín, Marco Raoul Marini, Iván Martín-Fernández, Sergio Esteban-Romero and Luigi Cinque

Abstract: Advanced Human-Computer Interaction techniques are commonly used in multiple application areas, from entertainment to rehabilitation. In this context, this paper proposes a framework to recognize hand gestures using a limited number of landmarks from the video images. This hand gesture recognition system comprises an image processing module that extracts and processes the coordinates of 21 hand points called landmarks, and a deep neural network module that models and classifies the hand gestures. These landmarks are extracted automatically through MediaPipe software. The experiments were carried out over the IPN Hand dataset in an independent-user scenario using a Subject-Wise Cross Validation. They cover the use of different landmark-based formats, normalizations, lengths of the gesture representations, and number of landmarks used as inputs. The system obtains significantly better accuracy when using the raw coordinates of the 21 landmarks through 125 timesteps and a light Recurrent Neural Network architecture (80.56 ± 1.19 %) or the hand anthropometric measures (82.20 ± 1.15 %) compared to using the speed of the hand landmarks through the gesture (72.93 ± 1.34 %). The proposed framework studied the effect of different landmark-based normalizations over the raw coordinates, obtaining an accuracy of 83.67 ± 1.12 % when using as reference the wrist landmark from each frame, and an accuracy of 84.66 ± 1.09 % when using as reference the wrist landmark from the first video frame of the current gesture. In addition, the proposed solution provided high recognition performance even when only using the coordinates from 6 (82.15 ± 1.16 %) or 4 (81.46 ± 1.17 %) specific hand landmarks using as reference the wrist landmark from the first video frame of the current gesture.
Download
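
The two wrist-based normalizations compared in the abstract above can be sketched as follows. This is an illustration, not the authors' code: it assumes 2D pixel coordinates, represents a gesture as a list of frames, and uses only two landmarks per frame for brevity where the real input has 21. The wrist is landmark index 0 in MediaPipe's hand model.

```python
WRIST = 0  # index of the wrist in MediaPipe's 21-point hand landmark model


def normalize_per_frame(gesture):
    """Subtract each frame's own wrist position from all landmarks in that frame."""
    return [[(x - frame[WRIST][0], y - frame[WRIST][1]) for x, y in frame]
            for frame in gesture]


def normalize_to_first_frame(gesture):
    """Subtract the wrist position of the first frame from every landmark."""
    ref_x, ref_y = gesture[0][WRIST]
    return [[(x - ref_x, y - ref_y) for x, y in frame] for frame in gesture]


# Invented gesture: two frames, two landmarks each (wrist first), hand moving right.
gesture = [
    [(100, 200), (120, 180)],
    [(140, 200), (160, 180)],
]
print(normalize_per_frame(gesture))
print(normalize_to_first_frame(gesture))
```

Per-frame normalization removes the hand's movement across the image and keeps only its shape, while the first-frame reference preserves the trajectory of the hand during the gesture. That difference is one possible explanation for the accuracy gap the abstract reports.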

Paper Nr: 21
Title:

Together Is Better! Integrating BDI and RL Agents for Safe Learning and Effective Collaboration

Authors:

Manuel Parmiggiani, Angelo Ferrando and Viviana Mascardi

Abstract: What happens when symbolic knowledge is injected into the learning process of a sub-symbolic agent? What happens when symbolic and sub-symbolic agents collaborate? And what happens when they do not? This paper explores an innovative combination of symbolic – i.e., Belief-Desire-Intention (BDI) – and sub-symbolic – i.e., Reinforcement Learning (RL) – agents. The combination is achieved at two different logical levels: at the single agent level, we show how symbolic knowledge may be exploited to drive the learning process of a RL agent; at the multiagent system level, we show how purely BDI agents and purely RL agents behave in the complex scenario of the ‘Among Us’ videogame, and – more interestingly – what happens when BDI agents compete against RL agents, and when BDI and RL agents cooperate to achieve their goals.
Download

Paper Nr: 23
Title:

A Pattern-Based Approach to Name and Address Parsing with Active Learning

Authors:

Onais Khan Mohammed, Khizer Syed, John Talburt, Adeeba Tarannum, Abdul Kareem Khan Kashif, Salman Khan, Najmudin Syed and Syed Yaser Mehdi

Abstract: Processing population data often requires parsing demographic items into a standard set of fields to achieve metadata alignment. This paper describes a novel approach based on token pattern mappings augmented by active learning. Input strings are tokenized and a token mask is created by replacing each token with a single-character code indicating the token’s potential function in the input string. A user-created mapping then directs each token represented in the mask to its correct functional category. Testing has shown the system to be as accurate as, and in some cases, more accurate than comparable parsing systems. The primary advantage of this approach over other systems is that it allows a user to easily add a new mapping when an input does not conform to any previously encoded mappings instead of having to reprogram system parsing rules or retrain a supervised parsing machine learning model.
Download
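
The token-mask mechanism described in the abstract above can be illustrated with a small sketch. Every detail here is hypothetical: the single-character codes, the lookup tables, the field names, and the mappings are invented for the example and are not taken from the authors' system.

```python
# Hypothetical lookup tables that drive the single-character codes.
STREET_TYPES = {"ST", "AVE", "RD", "BLVD"}
DIRECTIONS = {"N", "S", "E", "W"}


def token_code(token):
    """Map one token to a code hinting at its potential function."""
    upper = token.upper()
    if token.isdigit():
        return "N"  # number
    if upper in STREET_TYPES:
        return "T"  # street type
    if upper in DIRECTIONS:
        return "D"  # directional
    return "W"      # any other word


def build_mask(tokens):
    """Concatenate the per-token codes into a pattern string."""
    return "".join(token_code(t) for t in tokens)


# User-created mappings from a mask to the field of each token position.
MAPPINGS = {"NWT": ["house_number", "street_name", "street_type"]}


def parse(text):
    """Return (mask, parsed fields), or (mask, None) when no mapping exists yet."""
    tokens = text.replace(",", " ").split()
    mask = build_mask(tokens)
    fields = MAPPINGS.get(mask)
    if fields is None:
        return mask, None
    return mask, list(zip(fields, tokens))


mask, parsed = parse("123 Main St")
unknown_mask, unknown = parse("45 N Oak Ave")  # no mapping for this pattern yet

# The user supplies a mapping for the new pattern instead of changing any code.
MAPPINGS[unknown_mask] = ["house_number", "pre_direction",
                          "street_name", "street_type"]
_, learned = parse("45 N Oak Ave")
print(parsed, unknown, learned)
```

Adding one dictionary entry is enough to handle the previously unseen pattern, which mirrors the advantage the abstract claims over reprogramming rules or retraining a model.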

Paper Nr: 32
Title:

SwarmPrompt: Swarm Intelligence-Driven Prompt Optimization Using Large Language Models

Authors:

Thilak Shekhar Shriyan, Janavi Srinivasan, Suhail Ahmed, Richa Sharma and Arti Arya

Abstract: The advancement of generative AI and large language models (LLMs) has made developing effective text prompts challenging, particularly for less experienced users. LLMs often struggle with nuances, tone, and context, necessitating precise prompt engineering for generating high-quality outputs. Previous research has utilized approaches such as gradient descent, reinforcement learning, and evolutionary algorithms for optimizing prompts. This paper introduces SwarmPrompt, a novel approach that employs swarm intelligence-based optimization techniques, specifically Particle Swarm Optimization and Grey Wolf Optimization, to enhance and optimize prompts. SwarmPrompt combines the language processing capabilities of LLMs with swarm operators to iteratively modify prompts and identify the best-performing ones. This method reduces human intervention, surpasses human-engineered prompts, and decreases the time and resources required for prompt optimization. Experimental results indicate that SwarmPrompt outperforms human-engineered prompts by 4% for classification tasks and 2% for simplification and summarization tasks. Moreover, SwarmPrompt converges faster, requiring half the number of iterations while providing superior results. This approach offers an efficient and effective alternative to existing methods. Our code is available at SwarmPrompt.
Download

Paper Nr: 34
Title:

Constraint-Based Optimization for Scheduling Medical Appointments

Authors:

George Assaf, Sven Löffler and Petra Hofstedt

Abstract: In this paper, we introduce a novel approach for solving as well as optimizing the medical appointment scheduling problem (MASP) using constraint programming. The MASP is a complex and critical task in health care management, directly impacting both patient care and operational efficiency. We formalize the MASP as a set of constraints that encode diverse requirements for scheduling medical appointments, including intuitive requirements such as the availability of both patients and medical resources, including physicians, nurses and medical equipment. Furthermore, our model accounts for patient preferences, such as favoring specific dates and/or particular resources whenever feasible. The proposed method incorporates optimization techniques that enhance the scheduling process by considering appointment urgency and balancing workload distribution among the assigned resources, thereby improving the allocation of medical resources. As an outcome, our constraint model demonstrates high efficiency by scheduling medical appointments in just milliseconds.

Paper Nr: 40
Title:

Hallucinations in LLMs and Resolving Them: A Holistic Approach

Authors:

Rajarshi Biswas, Sourav Dutta and Dirk Werth

Abstract: Generative artificial intelligence has recently attracted tremendous interest across industry and academia, leading to rapid growth. Developments in model architecture, training datasets and large-scale computing enable impressive generative tasks in textual computing, computer vision and other fields. However, generative processes suffer from various challenging artifacts that can cause confusion, create risks or compromise security. In this paper, we explore in detail the problem of inconsistent or hallucinated generation in natural language generation (NLG). We define the problem and survey the current techniques for detection, measurement and mitigation on five different tasks: abstractive summarization, question answering, dialogue generation, machine translation, and named entity recognition combined with information retrieval.

Paper Nr: 58
Title:

Examination of Document Clustering Based on Independent Topic Analysis and Word Embeddings

Authors:

Riku Yasutomi, Seiji Yamada and Takashi Onoda

Abstract: In recent years, research on text mining, which aims to extract useful information from textual data, has been actively conducted. This paper focuses on document classification methods that extract topics from textual data and assign documents to the extracted topics. Among these methods, the most representative is Latent Dirichlet Allocation (LDA). However, it has been pointed out that LDA often extracts similar topics due to the high amount of shared information between topics. Therefore, this paper proposes a document classification method based on Independent Topic Analysis (ITA), which extracts topics based on the independence of topics, and on Word Embedding, which learns word co-occurrence. This approach aims to avoid extracting similar topics and to achieve information grouping that is closer to human intuition. As a comparative metric, we used the agreement rate between the results of manually classifying documents into topics and those classified by each method. The results of the comparative experiment showed that the agreement rate for document classification based on ITA and Word Embedding was the highest. These results suggest that the proposed method can achieve document classification closer to human perception.

Paper Nr: 65
Title:

CAMMA: A Deep Learning-Based Approach for Cascaded Multi-Task Medical Vision Question Answering

Authors:

Teodora-Alexandra Toader, Alexandru Manole and Gabriela Czibula

Abstract: Medical Visual Question Answering is a multi-modal problem which combines visual and language information to address medical inquiries, offering potential benefits in computer-aided diagnosis and medical education. Deep Learning has proven effective in this area; however, the scarcity of data remains an issue for this data-hungry approach. To tackle this, we propose CAMMA, a cascaded multi-task architecture for Medical Visual Question Answering, achieving state-of-the-art results on the OVQA dataset with 71.45% accuracy. The model has all the advantages of a multi-task network, reducing overfitting and increasing data efficiency by capitalizing on the additional output information for each input sample. To test the adaptability of our model, we apply the same method on the VQA-Med 2019 dataset. We experiment with the choice of objectives included in the multi-task framework and the weighting between them.

Paper Nr: 72
Title:

Deep Neural Network Based Algorithm for Recognition of Static Signs of Polish Sign Language

Authors:

Wiktor Barańczyk and Piotr Duch

Abstract: Developing sign language recognition algorithms is important for promoting accessibility and inclusion for deaf and hard-of-hearing individuals, improving education, and advancing technological applications in various fields. This paper presents a novel approach for recognizing static signs of Polish Sign Language using characteristic points and deep neural networks. As an input to the deep neural network the distances between every landmark of hands, elbows, and shoulders were used. The study focused on exploring the effectiveness of using deep learning techniques for sign recognition. The proposed algorithm was evaluated on two publicly available databases (NUS and LSA16) and achieved higher or comparable accuracy to other algorithms. Additionally, it was tested on a collected database of photographs of 24 people. The proposed algorithm achieved 96.45% accuracy, 96.15% recall, and 96.66% precision.

Paper Nr: 74
Title:

Optimizing Object Detection for Maritime Search and Rescue: Progressive Fine-Tuning of YOLOv9 with Real and Synthetic Data

Authors:

Luciano Lima, Fabio Andrade, Youcef Djenouri, Carlos Pfeiffer and Marcos Moura

Abstract: The use of unmanned aerial vehicles for search and rescue (SAR) brings a series of advantages and reduces the time required to find survivors. It is possible to use computer vision algorithms to automate person detection, enabling a faster response from the rescue team. A major challenge in training image detection systems is the availability of data. In the SAR context, it can be more challenging as datasets are scarce. A possible solution is to use a virtual environment to generate synthetic data, which can provide an almost unlimited amount of data already labeled. In this work, the use of real and synthetic data for training the model YOLOv9t in maritime search and rescue operations is explored. Different proportions of real data were used for training a model from scratch and for transfer learning by fine-tuning the model after being pretrained with synthetic data generated in Unreal Engine 4, to evaluate the performance aiming to reduce the reliance on real-world datasets. The total amounts of real and synthetic data were kept the same to ensure fair comparison. Fine-tuning a model pretrained on synthetic data with just 10% real data improved performance by 13.7% compared to using real data alone. An important finding is that the best performance was achieved with 70% real data rather than with a model trained solely on 100% real data. These results show that combining synthetic and real data enhances detection accuracy while reducing the need for large real-world datasets.

Paper Nr: 77
Title:

Early Detection of Harmful Algal Blooms Using Majority Voting Classifier: A Case Study of Alexandrium Minutum, Pseudo-Nitzschia Australis and Pseudo-Nitzschia Fraudulenta

Authors:

Abir Loussaief, Raïda Ktari, Yessine Hadj Kacem and Fatma Abdmouleh

Abstract: Harmful algal blooms (HABs) severely damage the environment with significant adverse effects on marine life and human beings. An accurate prediction of HAB events is equally important in bloom management. This work investigates machine learning models to predict HAB occurrences, specifically focusing on three toxic species: Alexandrium minutum, Pseudonitzschia australis, and Pseudonitzschia fraudulenta. A majority voting ensemble method was implemented to improve the prediction performance by integrating the strength of different individual classifiers. Furthermore, the Synthetic Minority Oversampling Technique (SMOTE) was used to handle the class imbalance problem, which aided in enhancing bloom detection of rare occurrences. Compared with individual classifiers, the majority voting ensemble achieved better performance degrees with balanced accuracies of 99.09%, 99.57%, and 97.56% for Alexandrium minutum, Pseudonitzschia australis, and Pseudonitzschia fraudulenta datasets, respectively. These findings highlight the potential of combining ensemble methods and data augmentation for improving HAB predictions, thereby contributing to more active observing and mitigation strategies.
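The two ingredients named in this abstract, majority voting and balanced accuracy, can be illustrated with a minimal sketch. It assumes hard (label-level) voting over already-computed predictions; the abstract does not state which base classifiers or voting variant were used, and the function names and toy labels below are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample (hard voting).

    predictions: list of lists, one inner list of labels per classifier.
    """
    combined = []
    for sample_votes in zip(*predictions):
        # most_common(1) returns the label with the highest vote count.
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so a rare bloom class counts as much as the common class."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Toy data (hypothetical): three classifiers, four samples, 1 = bloom, 0 = no bloom.
votes = [[1, 0, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0]]
ensemble = majority_vote(votes)
print(ensemble, balanced_accuracy([1, 1, 0, 0], ensemble))
```

Balanced accuracy is the natural metric here because plain accuracy can look high on imbalanced data even when the rare class is never detected.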

Paper Nr: 78
Title:

Traffic Sign Orientation Estimation from Images Using Deep Learning

Authors:

Raluca-Diana Chiş, Mihai-Adrian Loghin, Cristina Mierlă, Horea Bogdan Mureşan and Octav-Cristian Florescu

Abstract: This study presents our findings on estimating the horizontal rotation angle (yaw) of traffic signs from 2D images using deep learning techniques. The aim is to introduce novel approaches for accurately estimating a traffic sign’s orientation, with applications in automatic map generation. The primary goal is to associate a traffic sign with a road correctly. The main challenge consists of both attempting to estimate the left/right orientation of a sign from 2D images and accurately estimating the rotation of the sign in degrees. Our approach involves the usage of a classifier for determining the orientation of a traffic sign in relation to the observer. Furthermore, we tried to transfer the weights obtained from classification to regression models and study the impact on performance. Our best results are an L1 loss as low as 10.34° for yaw estimation and an accuracy of 62% for orientation class assessment. The image data was obtained from Grab’s Kartaview platform and was split into training/validation/testing while accounting for traffic sign class and shape balancing.

Paper Nr: 82
Title:

Multidimensional User Profile Model to Support System Recommendations in Complex Social Networks: Application to Hashtag Recommendations

Authors:

Abir Gorrab, Wala Rebhi, Narjes Bellamine Ben Saoud and Henda Hajjami Ben Ghezala

Abstract: Recommendation systems play a crucial role in providing relevant information through data analysis. One of the pivotal challenges in the recommendation process is modeling user profiles. However, many existing models focus on a single aspect to describe users, overlooking other valuable data. In response to this limitation, this paper introduces a comprehensive multidimensional model that captures various dimensions of a user within their complex social network. This model encompasses demographic, social, behavioral and homophilic dimensions, with the goal of offering more holistic recommendations tailored to different contexts. Towards the end of this article, we introduce a focused application of the multidimensional model. This specific application revolves around providing hashtag recommendations within the X platform (Twitter platform). This serves as a tangible demonstration of how the proposed model can be applied in a practical context within a real social network. The main goal is to comprehensively assess the model’s efficacy in generating recommendations by utilizing a varied set of user-related information. To accomplish this, we introduce and evaluate a recommendation approach driven by our proposed user profile model, showcasing relevant and notable results.

Paper Nr: 85
Title:

Machine Learning-Driven Monitoring for Early Detection and Management of Prediabetes

Authors:

Wesam A. Ali and Adeem Ali Anwar

Abstract: Prediabetes is a critical metabolic condition that acts as the precursor for type 2 diabetes (T2D). Early detection and management of prediabetes can prevent the onset of diabetes and associated complications. For individuals with prediabetes, having a reliable way to estimate their risk of developing T2D is crucial, as it helps them to keep their glycemic levels on track and may even enable them to regress to normoglycemia. Building on this, we propose a methodology to predict the progression rate of prediabetes. In this study, we enhanced the preexisting dataset by incorporating risk progression and risk probability using logistic regression. Moreover, we predicted the progression rate of prediabetes using machine learning-based approaches and performed comparative analysis using logistic regression, random forest, decision tree, gradient boosting, neural networks, and support vector machines. The models use key health indicators such as age, body mass index (BMI), gender, and comorbidities as characteristic factors of prediabetes progression. The results demonstrate that logistic regression outperforms other models with an accuracy of 99.93%, a precision of 99.92%, and an AUC-ROC of 1.0000, making it the most suitable model for predicting prediabetes risk. The proposed system offers a promising solution for real-time prediabetes monitoring.

Paper Nr: 101
Title:

Beyond Compute: A Weighted Framework for AI Capability Governance

Authors:

Eleanor Nell Watson

Abstract: Current AI governance metrics, focused primarily on computational power, fail to capture the full spectrum of emerging AI risks and capabilities, which risks significant unintended consequences. This analysis explores critical alternative paradigms including logic-based scaffolding techniques, graph search algorithms, agent ensembles, mixture-of-experts architectures, distributed training methods, and novel computing approaches such as biological organoids and photonic systems. By examining these as multidimensional weighted factors, this research aims to expand the discourse on AI progress beyond compute-centric models, culminating in actionable policy recommendations to strengthen frameworks like the EU AI Act in addressing the diverse challenges of AI development.

Paper Nr: 105
Title:

Multiple Crack Detection in Beam-Like Structures Using a Novel Particle Swarm Optimization Approach

Authors:

Flaviu-Catalin Florea, Horea Grebla, Gilbert-Rainer Gillich, Bogdan Nicușor Bindea and Catalin V. Rusu

Abstract: This paper presents a method for assessing two cracks in simply supported beams by identifying their locations and severities (depths). Our method is based on applying the Particle Swarm Optimization (PSO) algorithm with the measured natural frequencies for several bending vibration modes of an intact and cracked beam. We use relative frequency shifts (RFS) calculated for eight vibration modes and for all possible damage cases, using a mathematical relation deduced in previous research. We detect changes, calculate the RFSs and then subtract, separately for all modes, the measured RFSs from all calculated RFSs. Building on previously demonstrated applications of PSO to single-crack detection, we propose strategies to enable PSO to determine locations in scenarios involving two cracks. Our method is successful in accurately identifying two damage locations and severities.
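The search described in this abstract, a swarm minimizing the per-mode mismatch between measured and calculated RFS values, can be sketched with a textbook PSO loop. This is a minimal sketch and not the authors' novel variant. The RFS relation from their earlier work is not given in the abstract, so `toy_rfs` is a hypothetical stand-in, and the inertia and acceleration coefficients are common default choices rather than values from the paper.

```python
import math
import random


def pso(objective, bounds, n_particles=20, n_iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Clamp the new position to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


MODES = range(1, 9)  # eight bending modes, as stated in the abstract


def toy_rfs(x1, d1, x2, d2):
    """Hypothetical stand-in for the RFS relation: two additive crack contributions."""
    return [d1 * math.sin(n * math.pi * x1) ** 2 + d2 * math.sin(n * math.pi * x2) ** 2
            for n in MODES]


measured = toy_rfs(0.2, 0.05, 0.7, 0.03)  # synthetic "measurement" for illustration


def mismatch(params):
    """Sum over modes of squared (calculated - measured) RFS differences."""
    return sum((c - m) ** 2 for c, m in zip(toy_rfs(*params), measured))


bounds = [(0.0, 1.0), (0.0, 0.1), (0.0, 1.0), (0.0, 0.1)]
print(pso(mismatch, bounds, n_particles=30, n_iters=200))
```

With this toy relation a crack at position x is indistinguishable from one at 1 - x, and the two cracks can swap roles, so several parameter sets give the same mismatch. Ambiguities of this kind are one reason a two-crack search may need strategies beyond the single-crack case.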

Paper Nr: 106
Title:

Estimate Reference Evapotranspiration Using Machine Learning Methods

Authors:

Marwa Dorai, Mehrez Abdellaoui, Bouthaina Douh and Ali Douik

Abstract: Agriculture, a fundamental pillar of human civilisation, not only provides the food we need to survive, but is also a major driver of global economic growth. Yet this critical sector is increasingly threatened by the escalating impacts of climate change, particularly through the exacerbation of water scarcity in key agricultural regions. Changing climate patterns are disrupting rainfall cycles, leading to more frequent droughts and reduced water availability. As the global population grows exponentially and demand rises, farmers require water for irrigation to meet these needs. This growing resource scarcity underscores the urgent need for innovative, sustainable agricultural solutions to adapt to these challenges. To secure the future of water resources and safeguard agricultural productivity, it is crucial to proactively implement cutting-edge technologies such as the Internet of Things (IoT) and Artificial Intelligence (AI). In this context, we present a novel approach for estimating reference evapotranspiration ET0 with the aim of minimising water waste and improving the efficiency of irrigation water management. The study was carried out in a real-world setting where several sensors were installed to measure various parameters, including temperature, soil moisture and rainfall. The station is connected to a server application from which a dataset was generated after data cleaning and pre-processing. The parameters obtained from the dataset were classified in terms of their correlation with the output value ET0. Regression was then performed using various machine learning (ML) tools to predict water stress. The developed algorithms resulted in good performances in terms of coefficient of determination R2 and loss function RMSE. These performances exceed those of existing methods from the state of the art.

Paper Nr: 113
Title:

Enhanced Credit Card Fraud Detection Using Federated Learning, LSTM Models, and the SMOTE Technique

Authors:

Weddou Mohamedhen, Maha Charfeddine and Yessine Hadj Kacem

Abstract: In recent years, credit card transaction fraud has caused significant financial losses for both consumers and financial institutions. To effectively combat these losses, the development of a sophisticated fraud detection system is necessary. However, credit card fraud detection (CCFD) presents significant challenges, particularly in regards to data security and privacy, limiting financial institutions’ ability to share transaction data for model training. This paper introduces the use of Federated Learning for CCFD, a technique that allows for decentralized learning while protecting data privacy. Federated Learning enables multiple institutions to collaborate on model training without having to share sensitive data, effectively addressing privacy concerns. To address the problem of class imbalance in fraud detection datasets, we apply the Synthetic Minority Oversampling Technique (SMOTE) to ensure a balanced dataset. Our study compares Long Short-Term Memory (LSTM) networks to Convolutional Neural Networks (CNN) within a Federated Learning framework. The experimental results demonstrate that combining SMOTE and LSTM in a Federated Learning setup produces superior performance. These findings highlight the strength of LSTM models in processing sequential transaction data and reveal that Federated Learning, when paired with resampling techniques, strengthens fraud detection accuracy.
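The collaboration pattern described in this abstract, where institutions share model parameters and not transactions, can be illustrated with a minimal sketch of a server-side aggregation step. It assumes the widely used federated averaging rule (a mean of client weights, weighted by sample count); the abstract does not name the aggregation rule, and the flat weight lists and function name are hypothetical simplifications.

```python
def federated_average(client_weights, client_sizes):
    """Aggregate client model weights into a global model.

    client_weights: one flat list of floats per client (same length for all).
    client_sizes:   number of local training samples per client.
    Only these weights leave each institution; raw transactions stay local.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(n_params)
    ]


# Toy round (hypothetical numbers): two banks with 1 and 3 local samples.
print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In a real round the server would broadcast the averaged weights back to the clients, each of which continues training its local LSTM or CNN on its own (possibly SMOTE-resampled) data.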

Paper Nr: 117
Title:

AgeGen Bio Track: Continuous Mouse Behavioral Biometrics-Based Age and Gender Profiling in Online Education Platforms

Authors:

Aditya Subash, Insu Song, Ickjai Lee and Kyungmi Lee

Abstract: Mouse behavioral biometric-based authentication systems have attracted significant attention as they are considered a more secure alternative to conventional online assessment fraud detection systems. This is attributed to their ability to continuously authenticate users non-intrusively by analyzing their distinctive mouse operating behavior. Most behavioral biometric-based research studies focus on predicting user identity as the primary objective for online assessment fraud detection. However, they do not consider predicting other user-centric parameters like age and gender. Furthermore, there is a need to identify the best segmentation approach and mouse behavior feature set for age and gender classification. We propose the AgeGen Bio track system, a continuous mouse behavioral biometric-based age and gender tracking system for online education platforms. To accomplish this, we first collect novel mouse behavior data with user demographic information. We then evaluate the efficacy of different segmentation approaches, feature sets, and machine learning models for age and gender classification. Experimental results show that the random forest algorithm paired with the three mouse-movement segmentation approach and user characteristic feature set are the best approaches that need to be incorporated into the system, as they achieved promising results.

Paper Nr: 135
Title:

Grounding a Social Robot’s Understanding of Words with Associations in a Cognitive Architecture

Authors:

Thomas Sievers, Nele Russwinkel and Ralf Möller

Abstract: Social robots and humans need a common understanding of the current situation in order to interact and solve tasks together. They should know what the other one is talking about and refer to the same things. Word associations can help to find a common conceptual ground by enabling the robot to learn an association model of a human counterpart with regard to certain words and take them into account for its actions. This grounding of abstract words and ideas helps to constrain possible meanings. A model of a cognitive architecture connected to a social robot stores and processes chunks of memory from a language game between the robot and a human. The robot gives two words and keeps a third in mind. The human is asked to name a word associated with the two given words. In this way, an association model of the conceptual contexts of the human interaction partner is created. The dialog parts of the robot are generated with ChatGPT from OpenAI. An ACT-R model analyzes the data received from the robot, searches for suitable associations already in memory and, if applicable, provides feedback on these associations preferred by the human.

Paper Nr: 154
Title:

Design and Implementation of a Data Model for AI Trustworthiness Assessment in CCAM

Authors:

Ruben Naranjo, Nerea Aranjuelo, Marcos Nieto, Itziar Urbieta, Javier Fernández and Itsaso Rodríguez-Moreno

Abstract: Amidst the growing landscape of trustworthiness-related initiatives and works both in the academic community and from official EU groups, there is a lack of coordination in the nature of the concepts used in these works and their relationships. This lack of coordination generates confusion and hinders the advances in trustworthy AI systems. This confusion is particularly grave in the CCAM domain given nearly all functionalities related to vehicles are safety-critical applications and need to be perceived as trustworthy in order for them to become available to the general public. In this paper, we propose the use of a defined set of terms and their definitions, carefully selected from the existing reports, regulations, and academic papers; and construct an ontology-based data model that can assist any user in the comprehension of those terms and their relationship to one another. In addition, we implement this data model as a tool that guides users on the self-assessment of the trustworthiness of an AI system. We use a graph database that allows making queries and automating the assessment of any particular AI system. We demonstrate the latter with a practical use case that makes an automated trustworthiness assessment based on user-inputted data.

Paper Nr: 172
Title:

A Refined Multilingual Scene Text Detector Based on YOLOv7

Authors:

Houssem Turki, Mohamed Elleuch and Monji Kherallah

Abstract: In recent years, significant advancements in deep learning and the recognition of text in natural scene images have been achieved. Despite considerable progress, the efficacy of deep learning and the detection of multilingual text in natural scene images often face limitations due to the lack of comprehensive datasets that encompass a variety of scripts. Added to this is the absence of a robust detection system capable of overcoming the majority of existing challenges in natural scenes and taking into account in parallel the characteristics of each writing of different languages. YOLO (You Only Look Once) is a highly utilized deep learning neural network that has become extremely popular for its adaptability in addressing various machine learning tasks. YOLOv7 is an enhanced iteration of the YOLO series. It has also proven to be effective in solving complex image-related problems thanks to the evolution of its 'Backbone' responsible for capturing the features of images to overcome the challenges encountered in a natural environment which leads us to adapt it to our text detection context. Our first contribution is to overcome environmental variations through the use of specific data augmentation based on improved basic techniques and a mixed transformation method applied to “RRC-MLT” and “SYPHAX” multilingual datasets which both contain Arabic scripts. The second contribution is the refinement of the 'Backbone' block of the YOLOv7 architecture to better extract the small details of the text which particularly stand out in Arabic scripts in punctuation marks. The article highlights future research directions aimed at developing a generic and efficient multilingual text detection system in the wild that also handles Arabic scripts, which is a new challenge that adds to the context, which justifies the choice of the two datasets.

Paper Nr: 173
Title:

Propagation-Based Domain-Transferable Gradual Sentiment Analysis

Authors:

Célia da Costa Pereira, Claude Pasquier and Andrea G. B. Tettamanzi

Abstract: We propose a novel refinement of a gradual polarity propagation method to learn the polarities of concepts and their uncertainties with respect to various domains from a labeled corpus. Our contribution consists of introducing a positive correction term in the polarity propagation equation to counterbalance negative psychological bias in reviews. The proposed approach is evaluated using a standard benchmark, showing an improved performance relative to the state of the art, good cross-domain transfer and excellent coverage.

Paper Nr: 197
Title:

Synthesizing Annotated Cell Microscopy Images with Generative Adversarial Networks

Authors:

Duway Nicolas Lesmes-Leon, Miro Miranda, Maria Caroprese, Gillian Lovell, Andreas Dengel and Sheraz Ahmed

Abstract: Data scarcity and annotation limit the quantitation of cell microscopy images. Data acquisition, preparation, and annotation are costly and time-consuming. Additionally, cell annotation is an error-prone task that requires personnel with specialized knowledge. Generative artificial intelligence is an alternative to alleviate these limitations by generating realistic images from an unknown data probabilistic distribution. Still, extra effort is needed since data annotation remains an independent task of the generative process. In this work, we assess whether generative models learn meaningful instance segmentation-related features, and their potential to produce realistic annotated images. We present a single-channel grayscale segmentation mask pipeline that differentiates overlapping objects while minimizing the number of labels. Additionally, we propose a modified version of the established StyleGAN2 generator that synthesizes images and segmentation masks simultaneously without additional components. We tested our generative pipeline with LIVECell and TissueNet, two benchmark cell segmentation datasets. Furthermore, we augmented a segmentation deep learning network with synthetic samples and illustrated improved or on-par performance compared to its non-augmented version. Our results support that the features learned by generative models are relevant in the annotation context. With adequate data preparation and regularization, generative models are capable of producing realistic annotated samples cost-effectively.

Paper Nr: 202
Title:

GAna: Model Generators and Data Analysts for Streamlined Processing

Authors:

Stephanie C. Fendrich, Philipp Flügger, Annegret Janzso, David Kaub, Stefan Klein, Anna Kravets, Patrick Mertes, Nhat Tran, Jan Ole Berndt and Ingo J. Timm

Abstract: In the evolving landscape of Explainable AI, reliable and transparent data processing is essential to ensure trustworthiness in model development. While agent-based modeling and simulation are used to provide insights into complex systems, this becomes vital when applying results to decision-making processes. This paper presents the GAna workflow — an approach that integrates model generation and data analysis to streamline the workflow from data preprocessing to result interpretation. By automating data handling and facilitating the reuse of processed and generated data, the GAna workflow significantly reduces the manual effort and computational expense typically associated with creating synthetic populations and other data-intensive tasks. We demonstrate the effectiveness of the workflow through two distinct case studies, highlighting its potential to enhance transparency in AI applications.

Paper Nr: 221
Title:

Leveraging Machine Learning in American Sign Language Recognition

Authors:

Lyth Khaled Al-Shbeilat, Anis Mezghani, Monji Kherallah and Faiza Charfi

Abstract: Understanding and recognizing sign language plays a role in connecting hearing and deaf communities effectively. This research assesses machine learning models for recognizing American Sign Language gestures by utilizing a dataset structured similarly to the traditional MNIST format. The study experimented with Decision Trees, Multi-Layer Perceptron (MLP), and K-Nearest Neighbours (KNN). To enhance the models’ performance measures, scaling normalization and outlier handling were employed during the preprocessing stage. Each classifier was methodically adjusted to enhance accuracy levels. The experimental findings showed that KNN achieved an accuracy rate of 99.4%, surpassing the MLP, which achieved 98.9%. In contrast, the Decision Tree algorithm displayed an accuracy rate of 68.9% after optimizing its parameters. These results indicate that both KNN and MLP stand out as models for recognizing ASL gestures due to their ability to capture patterns and spatial relationships within the dataset. The study also demonstrates the effective impact of data pre-processing in improving the accuracy of the results. This study adds value to the field by conducting a comparison of machine learning models to understand their effectiveness in recognizing ASL gestures and identifying both their advantages and shortcomings for potential enhancements in recognition technology.

Paper Nr: 224
Title:

Coconut Palm Tree Counting on Drone Images with Deep Object Detection and Synthetic Training Data

Authors:

Tobias Rohe, Barbara Böhm, Michael Kölle, Jonas Stein, Robert Müller and Claudia Linnhoff-Popien

Abstract: Drones have revolutionized various domains, including agriculture. Recent advances in deep learning have propelled among other things object detection in computer vision. This study utilized YOLO, a real-time object detector, to identify and count coconut palm trees in Ghanaian farm drone footage. The farm presented has lost track of its trees due to different planting phases. While manual counting would be very tedious and error-prone, accurately determining the number of trees is crucial for efficient planning and management of agricultural processes, especially for optimizing yields and predicting production. We assessed YOLO for palm detection within a semi-automated framework, evaluated accuracy augmentations, and pondered its potential for farmers. Data was captured in September 2022 via drones. To optimize YOLO with scarce data, synthetic images were created for model training and validation. The YOLOv7 model, pretrained on the COCO dataset (excluding coconut palms), was adapted using tailored data. Trees from footage were repositioned on synthetic images, with testing on distinct authentic images. In our experiments, we adjusted hyperparameters, improving YOLO’s mean average precision (mAP). We also tested various altitudes to determine the best drone height. From an initial mAP@.5 of 0.65, we achieved 0.88, highlighting the value of synthetic images in agricultural scenarios.
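The reported metric, mAP@.5, counts a predicted box as a correct detection only when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that overlap test, assuming axis-aligned boxes given as `(x_min, y_min, x_max, y_max)`; the function names are illustrative and not taken from any YOLO codebase.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, true_box, threshold=0.5):
    """True when a predicted palm box overlaps a labeled palm enough to count at mAP@.5."""
    return iou(pred_box, true_box) >= threshold


print(is_match((0, 0, 2, 2), (0, 0, 2, 2)), is_match((0, 0, 2, 2), (1, 0, 3, 2)))
```

For tree counting, the same test is what prevents one palm from being counted twice: overlapping predictions above an IoU threshold are typically merged during non-maximum suppression.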

Paper Nr: 225
Title:

On Enhancing Code-Mixed Sentiment and Emotion Classification Using FNet and FastFormer

Authors:

Anuj Kumar, Amit Pandey, Satyadev Ahlawat and Yamuna Prasad

Abstract: Code-mixing, the blending of multiple languages within a communication, is becoming increasingly common on social media. If left unchecked for sentiment analysis, this trend can lead to hate speech or violence, emphasizing the need for advanced techniques to interpret emotions and sentiments in code-mixed languages accurately. Current research has mainly focused on code-mixed text involving a limited number of languages. However, these methods often yield suboptimal results due to inadequate feature extraction by existing learning models. Additionally, achieving high accuracy and extracting meaningful features from code-mixed text remains a significant challenge. To address this, we propose two transformer-based feature extraction methods for sentiment and emotion classification in code-mixed text. The first method integrates the Fourier transform into the transformer-based cross-lingual language model, XLM-Roberta, by incorporating the encoder layers of Fourier Net (FNet). This Fourier encoder layer applies a Fourier transform to the final output vector of hidden states, enabling the model to capture complex patterns more effectively. The second method incorporates the encoding layers of FastFormer into the XLM-Roberta framework. FastFormer generates contextual embeddings using additive attention mechanisms, allowing for extracting more effective contextual features. Experimental results show that the proposed approaches improve accuracy compared to the state-of-the-art by 1.5% and 0.9% in sentiment detection and 3.9% and 1.97% in emotion detection on the publicly available SentiMix code-mixed benchmark dataset.
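The Fourier encoder layer mentioned in this abstract can be illustrated with the token-mixing step from the original FNet design: a discrete Fourier transform along the hidden dimension, another along the sequence dimension, and only the real part kept. This is a minimal pure-Python sketch of that step, not the authors' integration with XLM-Roberta; the function names and the tiny input matrix are hypothetical.

```python
import cmath


def dft(values):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def fourier_mix(hidden):
    """FNet-style mixing: real part of a 2-D DFT over (sequence, hidden) axes.

    hidden: list of seq_len rows, each a list of hidden_dim floats.
    """
    # Transform each token vector along the hidden dimension.
    rows = [dft(row) for row in hidden]
    # Transform each hidden channel along the sequence dimension.
    cols = [dft(list(col)) for col in zip(*rows)]
    # Transpose back to (seq_len, hidden_dim) and drop the imaginary part.
    return [[c.real for c in row] for row in zip(*cols)]


# Two tokens with two hidden features each (hypothetical values).
print(fourier_mix([[1.0, 2.0], [3.0, 4.0]]))
```

The mixing step has no learned parameters, which is why it is cheaper than self-attention; a production model would use a fast Fourier transform from a numerical library in place of the naive loop above.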
Download
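
As a rough illustration of the Fourier mixing mentioned in the abstract above: an FNet-style sublayer replaces self-attention with a 2D discrete Fourier transform over the sequence and hidden dimensions and keeps only the real part. The sketch below shows that operation alone on a toy array. The function name `fourier_mixing` and the toy shape are assumptions for illustration, and it does not reproduce how the authors attach the layer to XLM-RoBERTa.

```python
import numpy as np


def fourier_mixing(hidden_states):
    """FNet-style token mixing: 2D DFT over (sequence, hidden) axes, real part kept."""
    return np.fft.fft2(hidden_states, axes=(-2, -1)).real


# Toy stand-in for encoder output: 4 tokens with 8-dimensional hidden states.
toy_hidden = np.ones((4, 8))
mixed = fourier_mixing(toy_hidden)
print(mixed.shape)
```

The output has the same shape as the input, so the mixed states can be passed to a feed-forward sublayer or a classification head without reshaping.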

Paper Nr: 235
Title:

Fine Tuning LLMs vs Non-Generative Machine Learning Models: A Comparative Study of Malware Detection

Authors:

Gheorghe Balan, Ciprian-Alin Simion and Dragoş Teodor Gavriluţ

Abstract: The emergence of Generative AI has provided various scenarios where Large Language Models can be used to replace older technologies. The cyber-security industry has been an early adopter of these technologies, but mainly in scenarios involving security operation centers, support, or cyber-attack visibility. This paper compares how well Large Language Models perform against traditional machine learning models for malware detection with respect to various constraints that apply to a security product, such as inference time, memory footprint, detection rate, and false positive rate. We fine-tuned three open-source models (Llama2-13B, Mistral, Mixtral) and compared them with 18 classical machine learning models (feed-forward neural networks, SVMs, etc.) using more than 135,000 benign and malicious binary samples. The goal was to identify scenarios where large language models are suited to the task of malware detection.
Download

Paper Nr: 236
Title:

LSTM-Based Physics-Informed Neural Network for Lithium-Ion State of Charge Estimation

Authors:

Yusif Imamverdiyev, Amel Hidouri, Tedjani Mesbahi, Ahmed Samet and Christophe Lallement

Abstract: This paper introduces an advanced hybrid modeling framework that combines Physics-Informed Neural Networks (PINNs) with Long Short-Term Memory (LSTM) networks to improve the accuracy and reliability of time-series predictions. The proposed LSTM-PINN model is designed to capture intricate temporal dependencies while strictly adhering to physical laws, making it particularly effective for applications such as battery state estimation and energy storage management. The study focuses on enhancing State of Charge (SOC) prediction by leveraging the complementary strengths of PINNs and LSTM networks. This integrated approach enables precise SOC estimation while maintaining consistency with the physical principles governing battery behavior. Furthermore, the research situates itself within the domain of SOC estimation by incorporating Equivalent Circuit Models (ECMs) alongside PINNs and LSTM networks. This integration improves the model’s robustness and predictive performance by effectively accounting for both electrochemical dynamics and historical usage patterns.

Paper Nr: 239
Title:

Automated News Scraping and AI-Powered Analysis for Municipal Crime Mapping

Authors:

Pedro Arthur P. S. Ortiz and Leandro O. Freitas

Abstract: This paper presents an innovative approach to urban crime mapping through automated web scraping and data analysis techniques, addressing the challenge of limited crime data availability in smaller municipalities. Focusing on Santa Maria, Brazil, we develop a methodology to extract, process, and visualize crime-related information from local news sources. Our approach combines web scraping using Selenium, natural language processing with the Claude API, and data visualization techniques to create a comprehensive crime dataset. Through implementation, we present heat maps of crime hotspots, temporal analysis of crime patterns, and statistical correlations between crime-related factors. The research examines hourly, daily, and seasonal crime patterns, providing insights for law enforcement resource allocation. We discuss challenges and ethical considerations of using web-scraped data, including privacy concerns, reporting bias, and verification challenges. While acknowledging limitations such as data bias and accuracy concerns, this research provides a foundation for data-driven urban crime prevention strategies. The methodology offers a scalable framework that could be implemented across various urban environments, contributing to more effective crime prevention and public safety strategies.
Download

Paper Nr: 240
Title:

ATFSC: Audio-Text Fusion for Sentiment Classification

Authors:

Aicha Nouisser, Nouha Khediri, Monji Kherallah and Faiza Charfi

Abstract: The diversity of human expressions and the complexity of emotions are specific challenges related to sentiment analysis from text and speech data. Models must consider not only text but also nuances of intonation and emotions expressed by voice. To address these challenges, we created a bimodal sentiment analysis model named ATFSC that categorizes emotions based on textual and audio information. It fuses textual and audio information from conversations, providing a more robust analysis of sentiments, whether negative, neutral, or positive. Key features include the use of transfer learning with a pre-trained BERT model for text processing, a CNN-based audio feature extractor for audio processing, and flexible preprocessing capabilities that support different dataset formats. An attention mechanism was employed to perform a bimodal fusion of audio and text features, which led to a notable performance improvement. As a result, we observed improved accuracy, reaching 64.61%, 69%, 72%, and 81.36% on the IEMOCAP, SLUE, MELD, and CMU-MOSI datasets, respectively.
Download

Paper Nr: 251
Title:

Enhanced QTTN Design: Scalable Quantum Circuits for Arbitrary Qubit Counts

Authors:

Krishnageetha Karuppasamy and Johnson P. Thomas

Abstract: We explore the design and implementation of Enhanced Quantum Tree Tensor Networks (EQTTNs) for Variational Quantum Circuits. A Quantum Tree Tensor Network (QTTN) offers a hierarchical structure to manage entanglement and optimize quantum operations. The traditional requirement for constructing a QTTN is that the number of qubits (n) must be of the form n = 2^x. This paper proposes an EQTTN design that can accommodate any number of qubits. This flexibility means there are no restrictions on the problem size, allowing for broader applicability and scalability in various quantum computing tasks. We provide a comprehensive analysis of the parameter count required for EQTTNs. Experimental results validate our theoretical model in terms of fidelity score and entanglement strength.
Download

Paper Nr: 256
Title:

Time Series and Deep Learning Approaches for Predicting English Premier League Match Outcomes

Authors:

Weronika Wiechno, Bartosz Bartosik and Piotr Duch

Abstract: The continuous development of tools used in football match analysis has resulted in a greater availability of game statistics, providing analysts, coaches, and researchers with more detailed data regarding the matches played. This results in the need for more advanced algorithms for effectively processing and interpreting the available information. In this paper, a modified Siamese Neural Network architecture is presented. A time series approach is incorporated to capture temporal dynamics in teams’ performance throughout the analysed matches. The algorithm was compared with classifiers and deep neural network approaches commonly used for match outcome prediction in the literature. All methods were trained and tested on two prepared datasets with the same division into train and test sets. The proposed architecture outperforms the others by reaching higher overall accuracy in match outcome prediction.
Download

Paper Nr: 265
Title:

Learning to Run a Marathon: Avoid Overfitting to Speed

Authors:

Krisztián Gábrisch and István Megyeri

Abstract: Research and development in reinforcement learning is a dynamically evolving field, with a particular focus on robustness and continuous optimization of reward. The models learned in the OpenAI Gym and MuJoCo environments investigated here seek to make different dummies move in one direction as fast as possible without losing stability. During the learning process, the models are usually trained for a predefined number of steps, which can act as a limiting factor and unexpectedly restrict model performance. This iteration limit can contribute to model instability, often leading to model failure, thus hindering the model’s ability to collect additional rewards. We also observe that models face a major problem in simultaneously optimizing their stability and speed. We traced the learning process of the models through twenty checkpoints and defined various metrics to select the models that are most suitable for us. We noticed that the model obtained at the last checkpoint does not always perform the best, so it is worth monitoring the learning process to select better models along the way. Our code and pretrained models are available at https://github.com/szegedai/rl run marathon.
Download

Paper Nr: 295
Title:

Tractable Generative Modelling of Cosmological Numerical Simulations

Authors:

Amit Parag and Vaishak Belle

Abstract: Cosmological simulations aim to understand the matter distribution in the universe by employing either semi-analytic methods or hydrodynamical models of matter distribution. These simulations describe the evolution of baryonic structures within dark matter potential wells, where dark matter is modeled as a self-gravitating, collisionless system. Despite advances in reducing computational costs, these simulations still require millions of CPU hours to achieve stable solutions. This raises the question: can generative models predict galaxy properties from a partial history of their dynamical evolution? Tractable probabilistic models, such as sum-product networks, enable efficient computation of conditional probabilities, allowing conditional marginals to be computed in time linear in the model size. In this work, we investigate the application of sum-product networks to compactly represent and learn distributions for predictions in concordance cosmology. Using the Eagle suite of cosmological hydrodynamical simulations, we demonstrate that these graphical models can effectively reproduce mock galaxy catalogs, capturing the relationship between baryonic and dark matter with promising accuracy.
Download

Paper Nr: 302
Title:

Formal Reasoning About Trusted Third Party Protocols

Authors:

Aaron Hunter

Abstract: A trusted third party (TTP) is an entity that facilitates communication between agents by acting as an intermediary. Typical roles for a trusted third party include the establishment of session keys or the validation of commitment schemes. In a formal setting, this requires a model that provides some mechanism for representing trust and reasoning about dynamic beliefs. In this paper, we demonstrate how this can be captured using a combined modal logic of trust and belief. Our formalism uses plausibility models and model transformations to capture belief revision in a protocol run. It is novel in that it uses the modal accessibility relations in the logic to define a notion of trust, without requiring any additional formal machinery. We define the formal semantics of the logic, sketch the axiomatization, and demonstrate the basic verification methodology. Challenges are discussed, as well as issues related to practical deployment.
Download

Paper Nr: 304
Title:

GNN-MSOrchest: Graph Neural Networks Based Approach for Micro-Services Orchestration - A Simulation Based Design Use Case

Authors:

Nader Belhadj, Mohamed Amine Mezghich, Jaouher Fattahi and Lassaad Latrach

Abstract: In recent years, the micro-services architecture has emerged as a dominant paradigm in software engineering, praised for its modularity, scalability, and ease of maintenance. Nevertheless, orchestrating micro-services efficiently presents significant challenges, particularly in optimizing communication, load balancing, and fault tolerance. Graph Neural Networks (GNNs), with their ability to model and process data structured as graphs, are particularly well-suited for representing the complex interdependencies between micro-services. Despite their promising applications in micro-services architecture, GNNs are not sufficiently used for micro-services orchestration, which involves the automated management, coordination, and scaling of services. This paper proposes a novel GNN-based approach for micro-services orchestration. A simulation-based design use case is studied and analysed.
Download

Paper Nr: 308
Title:

A Leaf Disease Detection Using Machine Learning and Deep Learning: Comparative Study

Authors:

Mooad Al-shalout, Mohamed Elleuch and Ali Douik

Abstract: This study aims to provide innovative methods and additional suggestions for detecting plant diseases using deep learning techniques. The study focused on identifying diseases affecting major plants consumed daily, such as tomatoes, corn, and potatoes. The detected diseases included rust, early and late spots, mildew, and bacterial spots. The study relied on machine learning and deep learning algorithms, such as Support Vector Machine (SVM) and VGG19, to detect plant diseases. SIFT and Gabor filters were also incorporated into the work and tested using the SVM algorithm. The study reached highly accurate results: accuracy reached 98% using SVM and 97% using VGG19, which are satisfactory results compared to previous studies, confirming the effectiveness of the methods used in detecting plant diseases.
Download

Paper Nr: 321
Title:

Natural Language Interface for Goal-Oriented Knowledge Graphs Using Retrieval-Augmented Generation

Authors:

Kosuke Yano, Yoshinobu Kitamura and Kazuhiro Kuwabara

Abstract: A search method leveraging Retrieval-Augmented Generation (RAG) for goal-oriented knowledge graphs is proposed, with a specific focus on function decomposition trees. A function decomposition tree hierarchically represents functions of artifacts or actions of humans with explicit descriptions of purposes and goals. We developed a schema to convert the trees into RDF, enabling structured and efficient searches. Through RAG technology, a natural language interface converts users’ inputs into SPARQL queries, retrieving relevant data and subsequently presenting them in an accessible, chat-based format. Such flexible, purpose-driven searches enhance usability in complex knowledge graphs. We demonstrate that the tool effectively retrieves actions, intentions, and dependencies using an illustrative example and a real-world example of function decomposition trees.
Download

Paper Nr: 327
Title:

Proposal of a Method for Analyzing the Explainability of Similarity Between Short Texts in Spanish

Authors:

Isidro Jara Matas, Luis de la Fuente Valentin, Alfonso Ortega de la Puente and Javier Sanz Fayos

Abstract: The aim of this project is the design and implementation of a system for analyzing the explainability of similarity between short texts (< 200 words) in Spanish, with a special focus on the academic domain. For the system implementation, different models based on the BERT architecture will be used. A concise analysis of the explainability of the proposed system will be conducted, aiming to understand the intrinsic functioning of the method and to provide feedback to stakeholders, such as the author of the evaluated text or the professional deciding to use the system. Furthermore, based on the obtained results, an estimation of the system's goodness will be carried out through statistical analysis. This will enable both a comparison with other possible implementations and the proposal of future improvements that could have a positive impact on a more realistic assessment of texts.
Download

Paper Nr: 328
Title:

Advancing Polycystic Ovary Syndrome Detection with Artificial Intelligence Techniques

Authors:

Abir Gorrab, Nourhène Ben Rabah, Isuri Kariyawasam and Bénédicte Le Grand

Abstract: Polycystic Ovary Syndrome (PCOS) is a common hormonal disorder that affects women of reproductive age. Diagnosis mainly relies on traditional methods, such as clinical evaluations or laboratory tests, which can be expensive and time-consuming and are often accompanied by complex imaging techniques. The integration of Artificial Intelligence (AI), namely Machine Learning (ML) and Deep Learning (DL), seems to offer promising opportunities, allowing for the analysis of large datasets to improve PCOS detection and management. This work conducts a systematic literature review and aims to explore how ML and DL can optimize PCOS diagnosis by analyzing the most used data and algorithms while following a rigorous methodology to ensure the validity of the results. It also discusses the explainability of AI methods to be used by healthcare professionals, who are always looking for reliable results to support the best possible diagnosis for their patients.
Download

Paper Nr: 336
Title:

A Deep Learning Approach to Minimize Retrieval Time in Shuttle-Based Storage Systems

Authors:

Paul Courtin, Jean-Baptiste Fasquel, Mehdi Lhommeau and Axel Grimault

Abstract: Improvement of picking performance in automated warehouses is influenced by the assignment of articles to storage locations. This problem is known as the Storage Location Assignment Problem (SLAP). In this paper, we present a deep learning method to assign articles to storage locations inside a shuttle-based storage and retrieval system (SBS/RS). We introduce the architecture of our LSTM-based model and the public dataset used. Finally, we compare the retrieval time of articles provided by our model against other allocation methods.
Download

Paper Nr: 338
Title:

Translating Akkadian Transliterations to English with Transfer Learning

Authors:

Najat Nehme, Danielle Azar, Diana Kutsalo and Jalal Possik

Abstract: Akkadian is an ancient Semitic language with a complex cuneiform script and fragmented artifacts. These attributes make translating text from this language into another very challenging. In this paper, we utilize transfer learning with a model pre-trained on multiple languages to improve the accuracy of translation from Akkadian to English. By fine-tuning this model on a curated Akkadian-English dataset, the research aims to leverage the extensive linguistic pre-training of the model in order to adapt it to Akkadian’s specificities.
Download

Paper Nr: 341
Title:

Deep Learning-Based Vessel Traffic Prediction Using Historical Density and Wave Features

Authors:

Dogan Altan, Dusica Marijan and Tetyana Kholodna

Abstract: Sea traffic is fundamental information that needs to be considered while planning vessel operations to enhance navigational safety and operational efficiency. Therefore, several environmental constraints, such as weather and traffic conditions, must be taken into account to minimize delays caused by vessel traffic and improve safety by decreasing collision risks. In this paper, we address the vessel traffic prediction problem, which tackles predicting vessel traffic for ships using several sources of information. We propose a vessel traffic prediction method that processes information obtained from different sources indicating historical traffic and wave conditions for vessels. The proposed method consists of three models processing different types of features and fuses the outputs of these models for the vessel traffic prediction problem. We evaluate the proposed method on real-world historical vessel trajectories and report its performance by providing a comparison with other baselines. The experimental results indicate that our proposed method provides promising results for predicting vessel traffic with a mean squared error of 0.325.
Download

Paper Nr: 347
Title:

AI-Based Personalized Multilingual Course Recommender System Using Large Language Models

Authors:

Sourav Dutta, Florian Beier and Dirk Werth

Abstract: This paper presents an AI-driven personalized course recommender system designed to enhance user engagement and learning outcomes on educational platforms. Leveraging the EU DigComp competency framework, the system constructs detailed user profiles through a chat assistant that guides users in identifying relevant competency areas and completing tailored surveys. Course recommendations are generated based on a hybrid scoring model that integrates semantic similarity and competency alignment, ensuring that course suggestions are both contextually and skill-relevant. For users seeking structured guidance, the system offers a learning path feature, utilizing a large language model to suggest subsequent courses that align with the user’s interests and prior learning experiences. While traditional course recommenders often rely on simple keyword matching, our system dynamically combines user interests and competencies for nuanced recommendations across English and German courses. Screenshots of the system’s live demo showcase key functionalities, including chatbot-led profile creation, multilingual support, and personalized learning paths. This paper highlights the ongoing development of the recommender system and discusses future plans to further refine and expand its personalized learning capabilities.
Download
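
The abstract above mentions a hybrid scoring model that combines semantic similarity with competency alignment but does not give its formula. One plausible reading is a weighted sum of the two signals, sketched below. The weight `alpha`, the Jaccard-overlap definition of alignment, and all function names are hypothetical choices made for illustration, not the authors' scoring model.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def competency_alignment(user_areas, course_areas):
    """Hypothetical alignment measure: Jaccard overlap of competency-area sets."""
    user_areas, course_areas = set(user_areas), set(course_areas)
    union = user_areas | course_areas
    return len(user_areas & course_areas) / len(union) if union else 0.0


def hybrid_score(user_vec, course_vec, user_areas, course_areas, alpha=0.5):
    """Assumed weighted sum; alpha balances semantic and competency signals."""
    semantic = cosine_similarity(user_vec, course_vec)
    alignment = competency_alignment(user_areas, course_areas)
    return alpha * semantic + (1 - alpha) * alignment
```

Courses would then be ranked by `hybrid_score` in descending order; tuning `alpha` trades contextual relevance against skill relevance.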

Paper Nr: 355
Title:

Punish the Pun-ish: Enhancing Text-to-Pun Generation with Synthetic Data from Supervised Fine-tuned Models

Authors:

Tomohito Minami, Ryohei Orihara, Yasuyuki Tahara, Akihiko Ohsuga and Yuichi Sei

Abstract: Puns are clever wordplays that exploit sound similarities while contrasting different meanings. Such complex puns remain challenging to create, even with today’s advanced large language models. This study focuses on generating Japanese juxtaposed puns while preserving the original meaning of input sentences. We propose a novel approach, applying Direct Preference Optimization (DPO) after supervised fine-tuning (SFT) of a pre-trained language model, utilizing synthetic data generated from the SFT model to refine pun generation. Experimental results indicate that our approach yields a marked improvement, evaluated using neural network-based and rule-based metrics designed to measure pun-ness, with a 2.3-point increase and a 7.9-point increase, respectively, over the baseline SFT model. These findings suggest that integrating SFT with DPO enhances the model’s ability to capture phonetic nuances essential for generating juxtaposed puns.
Download

Paper Nr: 361
Title:

Price Drivers in Prediction Markets: An Agent-Based Model of Competing Narratives

Authors:

Arwa Bokhari

Abstract: In this paper, I investigate price formation in prediction markets via an agent-based model (ABM). Prediction market prices can be interpreted as the probability of an event occurring, based on the aggregated beliefs of market participants. By utilizing a simple market exchange populated with opinionated agents and calibrating the model parameters, I aim to identify the effect on market price introduced by the three main drivers of the opinion formation process within two competing groups of agents: self-reinforcement, herding, and additive responses to inputs. Using a real-world dataset of Bitcoin prices, I show that both groups tend to follow the overall market sentiment. However, when the market mood aligns with a particular group’s opinion, that group becomes more self-reinforcing; conversely, when the mood does not favour their opinion, they become less self-reinforcing. Furthermore, I propose to use the temporally generated parameter values produced by the calibrated model, as well as the temporal prices and market moods shifted by seven days, as the training set for a supervised machine learning model that solves the multi-target learning problem of forecasting both short-term price trends and the expected trajectory of the two groups’ opinion dynamics. The code from this research is available for other researchers to use, build upon, and extend.
Download

Paper Nr: 365
Title:

Machine Learning Approaches in the Detection of Amyotrophic Lateral Sclerosis Disease Using Orofacial Gestures

Authors:

Mara Hajdu Măcelaru, Rareş Chiuzbăian and Petrică Pop

Abstract: Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease that affects nerve cells in the brain and spinal cord, specifically the motor neurons. As far as we know, there is no single test that can definitively diagnose ALS, and the diagnosis is often based on a combination of clinical findings, medical history, physical examination, and various tests to rule out other possible conditions and confirm the diagnosis. The present work proposes four machine learning (ML) algorithms, namely K-Nearest Neighbors, Iterative Dichotomiser 3, Naive Bayes, and Logistic Regression, to help diagnose early signs of ALS. To test the proposed ML algorithms, we used the only existing data set, created by the Sunnybrook Research Institute in Toronto. Using images extracted from videos of the participants, we developed a system that recognizes early signs of ALS from orofacial gestures. The experimental results show that the described ML techniques enable accurate ALS predictions and can be easily integrated into healthcare systems for diagnostic use.
Download

Paper Nr: 368
Title:

Optimizing Automotive Inventory Management: Harnessing Drones and AI for Precision Solutions

Authors:

Qian Zhang, Dan Johnson, Mark Jensen, Connor Fitzgerald, Daisy Clavijo Ramirez and Mia Y. Wang

Abstract: Inventory errors within the automotive manufacturing industry pose significant challenges, incurring substantial financial costs and requiring extensive human labor resources. The inherent inaccuracies associated with traditional inventory management practices further exacerbate the issue. To tackle this complex problem, this paper explores the integration of cutting-edge technologies, including UAV (Unmanned Aerial Vehicle) drones, computer vision, and deep learning models, for monitoring inventory in parking lots adjacent to manufacturing plants and harbors before vehicle shipment. These technologies enable real-time, automated inventory tracking and management, offering a more accurate and efficient solution to the problem. Leveraging drones equipped with high-resolution cameras, the system captures real-time imagery of parked vehicles and their components, while deep learning models facilitate precise inventory analysis. This forward-looking approach not only mitigates the costs associated with inventory errors but also equips manufacturers with the agility to optimize their production processes, ensuring competitiveness within the automotive industry.
Download

Paper Nr: 382
Title:

Toolshed: Scale Tool-Equipped Agents with Advanced RAG-Tool Fusion and Tool Knowledge Bases

Authors:

Elias Lumer, Vamse Kumar Subbiah, James A. Burke, Pradeep Honaganahalli Basavaraju and Austin Huber

Abstract: Recent advancements in tool-equipped agents (LLMs) have enabled complex tasks like secure database interactions and code development. However, scaling tool capacity beyond agent reasoning or model limits remains a challenge. In this paper, we address these challenges by introducing Toolshed Knowledge Bases, a tool knowledge base (vector database) designed to store enhanced tool representations and optimize tool selection for large-scale tool-equipped agents. Additionally, we propose Advanced RAG-Tool Fusion, a novel ensemble of tool-applied advanced retrieval-augmented generation (RAG) techniques across the pre-retrieval, intra-retrieval, and post-retrieval phases, without requiring fine-tuning. During pre-retrieval, tool documents are enhanced with key information and stored in the Toolshed Knowledge Base. Intra-retrieval focuses on query planning and transformation to increase retrieval accuracy. Post-retrieval refines the retrieved tool documents, enables self-reflection, and equips the tools to the agent. Furthermore, by varying both the total number of tools (tool-M) an agent has access to and the tool selection threshold (top-k), we address trade-offs between retrieval accuracy, agent performance, and token cost. Our approach achieves 46%, 56%, and 47% absolute improvements on the ToolE single-tool, ToolE multi-tool and Seal-Tools benchmarks, respectively (recall@5).
Download
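
The Toolshed results in the abstract above are reported as recall@5, that is, the fraction of the tools a query actually needs that appear among the top five retrieved tools. A minimal sketch of that metric follows. The function name `recall_at_k` and the example tool names are illustrative assumptions and are not drawn from the paper's benchmarks.

```python
def recall_at_k(retrieved, relevant, k=5):
    """Fraction of relevant items that appear in the top-k retrieved items."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


# Hypothetical ranked retrieval result for one query that needs two tools.
ranked_tools = ["sql_query", "web_search", "calculator", "file_reader", "email", "calendar"]
needed_tools = {"sql_query", "calendar"}
print(recall_at_k(ranked_tools, needed_tools, k=5))
```

In this toy case only one of the two needed tools is ranked in the top five, so raising `k` to 6 would recover the second at the cost of passing more tool descriptions to the agent.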

Paper Nr: 389
Title:

Analysis of Gaze Behavior for Constructing Learning Support Systems of Pass Behavior in First Person Perspective

Authors:

Norifumi Watanabe and Kota Itoda

Abstract: This study explores dynamic and adaptable human cooperative behavior in team sports such as soccer or handball, emphasizing the sharing of intentions and actions among players. A key factor in this context is the gaze direction of players, which is crucial for assessing situations and inferring teammates’ and opponents’ intentions, ultimately guiding practical actions. Recent advances in virtual reality (VR) technology have enabled detailed analysis of such behaviors and decision-making processes, facilitating experimental learning scenarios in cooperative settings. In this research, we investigate human gaze behavior in soccer from a first-person perspective, using head-mounted displays (HMDs) and virtual environments to develop supportive learning systems. Through experiments in which subjects experience offensive scenarios from a real player’s viewpoint within a VR environment, we analyze how their gaze behavior changes during different phases of passing and attacking in the game.
Download

Paper Nr: 396
Title:

Named Entity Discovery and Alignment in Parallel Data

Authors:

Zuzana Nevěřilová

Abstract: The paper describes two experiments with named entity discovery and alignment for English-Czech parallel data. In the previous work, we enriched the Parallel Global Voices corpus with named entity recognition (NER) for both languages and named entity linking (NEL) annotations for English. The alignment experiment employs sentence transformers and cosine similarity to identify NE translations from English to Czech and possibly other languages. The discovery experiment uses the same method to find possible translations between named entities in English and Czech n-grams. The described method achieves an F1 score of 0.94 in finding alignments between recognized entities. However, the same method can also discover unknown named entities with an F1 score of 0.70. The result indicates the method can be used to recognize named entities in parallel data in cases where no NER model is available with sufficient quality.
Download
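
The alignment step in the abstract above pairs English and Czech entity mentions by the cosine similarity of their sentence-transformer embeddings. The sketch below shows only the similarity-and-matching part on toy vectors. The embeddings, the `threshold` value, and the function names are assumptions for illustration; in practice the vectors would come from a multilingual sentence-transformer model.

```python
import numpy as np


def cosine_matrix(src_vecs, tgt_vecs):
    """Pairwise cosine similarities between rows of two embedding matrices."""
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    return src @ tgt.T


def align_entities(src_vecs, tgt_vecs, threshold=0.5):
    """Pair each source entity with its most similar target entity if above threshold."""
    sims = cosine_matrix(src_vecs, tgt_vecs)
    best = sims.argmax(axis=1)
    return [(i, int(j)) for i, j in enumerate(best) if sims[i, j] >= threshold]


# Toy embeddings standing in for two English and two Czech entity mentions.
english = np.array([[1.0, 0.0], [0.0, 1.0]])
czech = np.array([[0.0, 2.0], [3.0, 0.1]])
print(align_entities(english, czech))
```

The same scoring could be applied to candidate n-grams instead of recognized entities, which corresponds to the discovery experiment the abstract describes.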

Paper Nr: 398
Title:

Generative AI for Islamic Texts: The EMAN Framework for Mitigating GPT Hallucinations

Authors:

Amina El Ganadi, Sania Aftar, Luca Gagliardelli and Federico Ruozzi

Abstract: Recent advancements in large language models (LLMs) have facilitated specialized applications in fields such as religious studies. Customized AI models, developed using tools like GPT Builder to source information from authoritative collections such as Sahih al-Bukhari or the Qur’an, were explored as potential solutions to address inquiries related to Islamic teachings. However, initial evaluations highlighted significant limitations, including hallucinations and reference inaccuracies, which undermined their reliability for handling sensitive religious content. To address these limitations, this study proposes EMAN (Embedding Methodology for Authentic Narrations), a novel framework designed to enhance adherence to Sahih al-Bukhari through API-based integration. Three methodologies are examined within this framework: Zero-Shot Instructions, which guide the model without prior examples; Few-Shot Learning, which fine-tunes the model using a limited set of examples; and Embedding-Based Integration, which grounds the model directly in a verified Ahadith database. Results demonstrate that Embedding-Based Integration significantly improves performance by anchoring outputs in a structured knowledge base, reducing hallucination rates, and increasing accuracy. The success of this approach underscores its potential for enhancing LLM performance in precision-critical domains. This research provides a foundation for the ethical and accurate deployment of AI in religious studies, emphasizing accountability and fidelity to source material.

Paper Nr: 414
Title:

Agent-Centric Projection of Prompting Techniques and Implications for Synthetic Training Data for Large Language Models

Authors:

Dhruv Dhamani and Mary Lou Maher

Abstract: Recent advances in prompting techniques and multi-agent systems for Large Language Models (LLMs) have produced increasingly complex approaches. However, we lack a framework for characterizing and comparing prompting techniques or understanding their relationship to multi-agent LLM systems. This position paper introduces and explains the concepts of linear contexts (a single, continuous sequence of interactions) and non-linear contexts (branching or multi-path) in LLM systems. These concepts enable the development of an agent-centric projection of prompting techniques, a framework that can reveal deep connections between prompting strategies and multi-agent systems. We propose three conjectures based on this framework: (1) results from non-linear prompting techniques can predict outcomes in equivalent multi-agent systems, (2) multi-agent system architectures can be replicated through single-LLM prompting techniques that simulate equivalent interaction patterns, and (3) these equivalences suggest novel approaches for generating synthetic training data. We argue that this perspective enables systematic cross-pollination of research findings between prompting and multi-agent domains, while providing new directions for improving both the design and training of future LLM systems.

Paper Nr: 417
Title:

Top-Push Polynomial Ranking Embedded Dictionary Learning for Enhanced Re-Id

Authors:

Ying Chen, De Cheng, Zhihui Li and Andy Song

Abstract: Person re-identification (Re-Id) aims to match pedestrians captured by multiple non-overlapping cameras. In this paper, we introduce a novel dictionary learning approach enhanced with a top-push polynomial ranking metric for improved Re-Id performance. A key feature of our method is the incorporation of a ranking graph Laplacian term, designed to maximize intra-class compactness and inter-class dispersion. Specifically, we employ a polynomial distance function to evaluate similarity between person images and propose the Top-push Polynomial Ranking Loss (TPRL) function, which enforces a margin between positive matching pairs and their closest non-matching pairs. The TPRL is then embedded into the dictionary learning objective, enabling our method to capture essential ranking relationships among person images, a critical aspect for retrieval-focused tasks. Unlike traditional dictionary learning approaches, our method reformulates ranking constraints through a graph Laplacian, resulting in an approach that is both straightforward to implement and highly effective. Extensive experiments on four popular Re-Id benchmark datasets demonstrate that our method consistently outperforms existing approaches, highlighting its effectiveness and robustness.
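The margin idea behind a top-push loss can be illustrated with a small sketch. It is not the paper's TPRL: the abstract does not give the polynomial distance function, so squared Euclidean distance is swapped in here as a stand-in, and the function names and the `margin` default are assumptions.

```python
def squared_distance(x, y):
    """Stand-in distance (assumption); the paper uses a polynomial distance."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def top_push_loss(anchor, positives, negatives, margin=1.0, dist=squared_distance):
    """Hinge penalty for each positive that is not closer to the anchor
    than the nearest negative by at least `margin`."""
    nearest_negative = min(dist(anchor, n) for n in negatives)
    return sum(
        max(0.0, margin + dist(anchor, p) - nearest_negative) for p in positives
    )


anchor = [0.0, 0.0]
positives = [[1.0, 0.0]]

# The nearest negative is far away, so the margin is already satisfied.
loss_easy = top_push_loss(anchor, positives, [[3.0, 0.0], [0.0, 2.0]])

# The nearest negative is almost as close as the positive, so the loss is positive.
loss_hard = top_push_loss(anchor, positives, [[1.0, 0.5]])
```

Only the closest non-matching sample contributes, which is what distinguishes a top-push formulation from a loss summed over all negative pairs.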

Paper Nr: 422
Title:

Efficient Multi-Agent Exploration in Area Coverage Under Spatial and Resource Constraints

Authors:

Maram Hasan and Rajdeep Niyogi

Abstract: Efficient exploration in multi-agent Coverage Path Planning (CPP) is challenging due to spatial, resource, and communication constraints. Traditional reinforcement learning methods often struggle with agent coordination and effective policy learning in such constrained environments. This paper presents a novel end-to-end multi-agent reinforcement learning (MARL) framework for area coverage tasks, leveraging the centralized training and decentralized execution (CTDE) paradigm with enriched tensor-based observations and curiosity-based intrinsic rewards, which encourage agents to explore under-visited regions, enhancing coverage efficiency and learning performance. Additionally, prioritized experience adaptation accelerates convergence by focusing on the most informative experiences, improving policy robustness. By integrating these components, the proposed framework facilitates adaptive exploration while adhering to the spatial, resource, and operational constraints inherent in CPP tasks. Experimental results demonstrate superior performance over traditional approaches in coverage tasks under variable configurations.
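One simple way to reward visits to under-visited regions is a count-based bonus. The abstract does not specify how the curiosity signal is computed, so the sketch below is an assumption rather than the authors' method: the `VisitBonus` class, the grid-cell keys, and the 1/sqrt(count) decay are all hypothetical choices.

```python
import math
from collections import Counter


class VisitBonus:
    """Count-based intrinsic reward: the bonus shrinks as a cell is revisited."""

    def __init__(self, scale=1.0):
        self.counts = Counter()  # visits per grid cell
        self.scale = scale

    def reward(self, cell):
        """Record a visit to `cell` and return scale / sqrt(visit count)."""
        self.counts[cell] += 1
        return self.scale / math.sqrt(self.counts[cell])


bonus = VisitBonus()
first_visit = bonus.reward((0, 0))   # new cell, full bonus
second_visit = bonus.reward((0, 0))  # same cell again, smaller bonus
other_cell = bonus.reward((1, 0))    # a different new cell, full bonus
```

In a training loop this bonus would typically be added to the environment's extrinsic reward before the policy update.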

Paper Nr: 428
Title:

Toward a Quantum Fuzzy Approach for Emotion Modeling in Parent-Child Interactivity

Authors:

Cecília Botelho, Larissa Schonhofen, Helida Santos, Giancarlo Lucca, Adenauer Correa Yamin and Renata Hax Sander Reiser

Abstract: This study presents an integrated framework combining Quantum Fuzzy computing concepts with emotion modeling and simulations of intelligent agents. It explores the distinctions between Quantum Fuzzy and Classical Computing, focusing on parent-child relationships. Simulations performed on the Qiskit platform highlight significant differences in the results produced by these two approaches. The research emphasizes how membership degrees (MD) are represented in the quantum circuit model by interpreting fuzzy operations through unitary quantum transformations. Established fuzzy connectives, such as the exclusive OR, serve as an algebraic basis for constructing quantum operators and circuit representations. The algorithms demonstrate substantial potential for extension, allowing for modeling interactions among multiple agents using multidimensional quantum registers. Simulations within Qiskit offer a solid foundation for implementing these algorithms on real quantum platforms, paving the way for further exploration in this interdisciplinary field.

Paper Nr: 429
Title:

A Novel LSB-based Approach for Applications and Information Security

Authors:

Dhuha Al-Adhami, Hamza Gharsellaoui and Olfa Belkahla Driss

Abstract: Cryptography and steganography are two of the various techniques for securing data and preventing outside parties or attackers from accessing it during transmission across an open channel or any network. This article provides an overview of information security and its uses while also exploring various ways to enhance it. It also presents a hybrid approach based on the Least Significant Bit (LSB), called the NO-LSB technique, which finds a near-optimal solution to the adopted problem and successfully hides secret data in an image with minimal distortion and minimal memory use. The paper highlights the application’s effective operation and its ability to maintain message confidentiality, serving as a practical solution for secure communication needs.
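For readers unfamiliar with LSB steganography, the basic (non-hybrid) technique can be sketched as follows. This shows plain LSB substitution only, not the NO-LSB method of the paper; the flat list of pixel values and the `embed` and `extract` names are assumptions made for illustration.

```python
def embed(pixels, message: bytes):
    """Hide `message` in the least significant bits of `pixels` (values 0-255)."""
    # Unpack the message into bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # overwrite only the lowest bit
    return stego


def extract(pixels, n_bytes):
    """Recover `n_bytes` bytes from the least significant bits of `pixels`."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for pixel in pixels[8 * b: 8 * b + 8]:
            value = (value << 1) | (pixel & 1)
        out.append(value)
    return bytes(out)


cover = list(range(100, 132))  # 32 toy pixel values
stego = embed(cover, b"Hi")
recovered = extract(stego, 2)
max_change = max(abs(a - b) for a, b in zip(cover, stego))
```

Each pixel value changes by at most 1, which is why LSB embedding causes little visible distortion.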

Paper Nr: 451
Title:

LLMQuoter: Enhancing RAG Capabilities Through Efficient Quote Extraction from Large Contexts

Authors:

Yuri Façanha Bezerra and Li Weigang

Abstract: We introduce LLMQuoter, a lightweight, distillation-based model designed to enhance Retrieval-Augmented Generation (RAG) by extracting the most relevant textual evidence for downstream reasoning tasks. Built on the LLaMA-3B architecture and fine-tuned with Low-Rank Adaptation (LoRA) on a 15,000-sample subset of HotpotQA, LLMQuoter adopts a “quote-first-then-answer” strategy, efficiently identifying key quotes before passing curated snippets to reasoning models. This workflow reduces cognitive overhead and outperforms full-context approaches like Retrieval-Augmented Fine-Tuning (RAFT), achieving over 20-point accuracy gains across both small and large language models. By leveraging knowledge distillation from a high-performing teacher model, LLMQuoter achieves competitive results in a resource-efficient fine-tuning setup. It democratizes advanced RAG capabilities, delivering significant performance improvements without requiring extensive model retraining. Our results highlight the potential of distilled quote-based reasoning to streamline complex workflows, offering a scalable and practical solution for researchers and practitioners alike.

Paper Nr: 452
Title:

Neural Networks Bias Mitigation Through Fuzzy Logic and Saliency Maps

Authors:

Sahar Shah, Davide E. Ciucci, Sara L. Manzoni and Italo F. Zoppis

Abstract: Mitigating biases in neural networks is crucial to reduce or eliminate the predictive model’s unfair responses, which may arise from unbalanced training, defective architectures, or even social prejudices embedded in the data. This study proposes a novel and fully differentiable framework for mitigating neural network bias using Saliency Maps and Fuzzy Logic. We focus our analysis on a simulation study for recommendation systems, where neural networks are crucial in classifying job applicants based on relevant and sensitive attributes. Leveraging the interpretability of a set of Fuzzy implications and the importance of features attributed by Saliency Maps, our approach penalizes models when they overly rely on biased predictions during training. In this way, we ensure that bias mitigation occurs within the gradient-based optimization process, allowing efficient model training and evaluation.

Paper Nr: 456
Title:

Time Series Prediction Models for Diabetes: A Systematic Literature Review

Authors:

Wissem Mbarek, Nesrine Khabou, Lotfi Souifi and Ismael Bouassida Rodriguez

Abstract: Diabetes is a highly prevalent chronic disease that imposes significant health and economic burdens globally. Early and accurate prediction, along with timely intervention, is crucial to prevent or delay the onset of diabetes and its complications. Various techniques have been used to forecast this disease; one of them is time series analysis, which has shown promise in diabetes prediction research. This comprehensive review examines the existing literature on time series prediction models for diabetes, identifying the various machine learning and statistical methods employed, including recurrent neural networks, long short-term memory networks, autoregressive integrated moving average (ARIMA) models and hybrid approaches. The review highlights key time series parameters, such as glucose levels, insulin dosage, diet, physical activity, and other physiological metrics, that significantly impact predictive precision and overall performance of these models. The findings of this review provide valuable insight into the current state of time series prediction models for diabetes, underscoring the strengths and limitations of each approach.

Paper Nr: 457
Title:

Predictive Modelling of Agricultural Factors to Maximize Crop Yield

Authors:

Anvesha Nayak, Pramathi Vummadi, Apoorva Raj, Nasam Saimani and Suresh Jamadagni

Abstract: Crop yield prediction and factor analysis are methods through which technology can be utilized to improve the quality of current agricultural practices. This study focuses on improving crop yields based on different factors and ascertaining how climate change affects these factors and their prediction. The aim is to create a tool that lets farmers practice precision agriculture and makes them aware of which controllable factors can lead to better yield. The study proposes a three-step methodology for this process. First, we analyse past years' data and take into consideration the impact of climate change to understand how it relates to these variables as well as to crop yield. Second, we suggest some spatial variable management practices that could improve the overall agricultural output, along with preventative measures to ensure crop safety. Regular updates on these spatial variables will play an important role in helping the farmer make key decisions during the life cycle of the crop. Finally, in the third step, we aim to perform anomaly analysis on pests, weeds, diseases, and climatic anomalies, and suggest relevant countermeasures to the farmer.

Paper Nr: 479
Title:

A Study on Vulnerability Explanation Using Large Language Models

Authors:

Lucas B. Germano and Julio Cesar Duarte

Abstract: In the quickly advancing field of software development, addressing vulnerabilities with robust security measures is essential. While much research has focused on vulnerability detection using Large Language Models (LLMs), limited attention has been given to generating actionable explanations. This study explores the capability of LLMs to explain vulnerabilities in Java code, structuring outputs into four dimensions: why the vulnerability exists, its dangers, how it can be exploited, and mitigation recommendations. In this context, smaller LLMs struggled to produce outputs in the required JSON format, with CodeGeeX4 showing high semantic similarity to GPT-4o but generating many incorrect formats. CodeLlama 34B emerged as the best overall performer, balancing output quality and formatting consistency. Despite these findings, comparisons with the GPT-4o baseline revealed no significant differences to rank the models effectively. Human evaluation further revealed that all models, including GPT-4o, struggled to adequately explain complex vulnerabilities, underscoring the challenges in achieving comprehensive explanations.

Paper Nr: 482
Title:

A Collaborative Approach to Multimodal Machine Translation: VLM and LLM

Authors:

Amulya Ratna Dash and Yashvardhan Sharma

Abstract: With advancements in Large Language Models (LLMs) and Vision Language Pretrained Models (VLMs), there is a growing need to evaluate their capabilities and research methods to use them together for vision language tasks. This study focuses on using VLM and LLM collaboratively for Multimodal Machine Translation (MMT). We finetune LLaMA-3 to use provided image captions from VLMs to disambiguate and generate accurate translations for MMT tasks. We evaluate our novel approach using the German, French and Hindi languages, and observe consistent translation quality improvements. The final model shows an improvement of +3 BLEU score against the baseline and +4 BLEU score against the state-of-the-art model.

Paper Nr: 483
Title:

Dataset Generation for Egyptian Arabic Sign Language

Authors:

Mariam Ibrahim, Milad Ghantous and Nada Sharaf

Abstract: This literature review explores the existing body of work related to Egyptian Arabic Sign Language (EASL) datasets, focusing on translation and text-to-video alignment, and examining relevant hand and face landmark detection methodologies, including the use of skeletal joint point analysis. With a particular emphasis on the research gaps in datasets, alignment accuracy, and computer vision models tailored for Arabic dialects, this review aims to highlight the limitations and challenges within current literature. Despite advancements in general sign language research, EASL remains understudied, leaving significant gaps in the development of resources and tools for accurate gesture translation and synchronization. The review concludes by identifying the need for dialect-specific resources and advanced alignment techniques to support the growth of accessible, region-specific sign language datasets.

Paper Nr: 486
Title:

Comparative Analysis of CNNs and Vision Transformer Models for Brain Tumor Detection

Authors:

Safa Jraba, Mohamed Elleuch, Hela Ltifi and Monji Kherallah

Abstract: Brain tumors are irregular cell mixtures existing within the brain or central spinal canal. They could be cancerous or benign. The likelihood of the best possible prognosis and therapy increases with the speed and accuracy of detection. This work provides a method for detecting brain tumors that combines the capabilities of vision transformers and CNNs. In contrast to other studies that primarily relied on standalone CNN or ViT architectures, our method uniquely integrates these models with a Support Vector Machine classifier for the improvement of accuracy and robustness in medical image classification. While the ViT makes it possible to combine CNN and ViT to improve the accuracy of medical imaging of the disease, the CNN extracts hierarchical features. In-depth analyses of benchmark datasets pertaining to imaging modalities and clinical perspectives were conducted. According to the experimental findings, ViT and EfficientNet identified tumors with an accuracy of 98%, while the greatest reported accuracy of 98.3% was obtained when ViT was combined with an SVM classifier. Our findings suggest that our method may improve brain tumor detection methods.

Paper Nr: 487
Title:

When Are 1.58 Bits Enough? A Bottom-up Exploration of Quantization-Aware Training with Ternary Weights

Authors:

Jacob Nielsen, Lukas Galke and Peter Schneider-Kamp

Abstract: Contemporary machine learning models, such as language models, are powerful, but come with immense resource requirements both at training and inference time. Quantization-aware pre-training with ternary weights (1.58 bits per weight) has shown promising results in decoder-only language models and facilitates memory-efficient inference. However, little is known about how quantization-aware training influences the training dynamics beyond such Transformer-based decoder-only language models. Here, we engage in a bottom-up exploration of quantization-aware training, starting with multi-layer perceptrons and graph neural networks. Then, we explore 1.58-bit training in other transformer-based language models: encoder-only and encoder-decoder models. Our results show that in all of these settings, 1.58-bit training is on par with standard 32/16-bit models, yet we also identify challenges specific to 1.58-bit encoder-decoder models. Our results on decoder-only language models hint at a possible regularization effect introduced by quantization-aware training.
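The "1.58 bits" figure is log2(3), the information content of a weight restricted to {-1, 0, +1}. The sketch below shows one possible ternary quantizer based on absolute-mean scaling. The abstract does not state which quantizer the authors use, so this scheme and the `ternarize` name are assumptions; the straight-through gradient handling needed for actual quantization-aware training is not shown.

```python
import math


def ternarize(weights, eps=1e-8):
    """Quantize weights to {-1, 0, +1} with a single shared scale.

    Assumed scheme: divide by the mean absolute weight, round to the
    nearest integer, then clip to the range [-1, 1].
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale


weights = [0.9, -0.05, -1.2, 0.3]
ternary, scale = ternarize(weights)

# Dequantized approximation used in the forward pass.
approx = [t * scale for t in ternary]

# Three possible values per weight carry log2(3) bits of information.
bits_per_weight = math.log2(3)
```

Small weights collapse to zero and large ones saturate at plus or minus the scale, which is one intuition for why such training might act as a regularizer.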

Paper Nr: 489
Title:

Explainable AI in Labor Market Applications

Authors:

Gabriel Bicharra Santini Pinto, Carlos Eduardo Mello and Ana Cristina Bicharra Garcia

Abstract: The adoption of artificial intelligence (AI) applications has been accelerating in the labor market, driving productivity gains, scalability, and efficiency in human resource management. This progress has also raised concerns about AI’s negative impacts, such as flawed decisions, biases, and inaccurate recommendations. In this context, explainable AI (XAI) plays a crucial role in enhancing users’ understanding, satisfaction, and trust. This systematic review provides a segmented overview of explainability methods applied in the labor market. A total of 266 eligible studies were identified during the search and evaluation process, with 29 studies selected for in-depth analysis. The review highlights the different explainability requirements expressed by users of human resource systems. Additionally, it identifies the processes, tasks, and corresponding explainability methods implemented.

Paper Nr: 502
Title:

FedKD4DD: Federated Knowledge Distillation for Depression Detection

Authors:

Aslam Jlassi, Afef Mdhaffar, Mohamed Jmaiel and Bernd Freisleben

Abstract: Depression affects over 280 million people globally and requires timely, accurate intervention to mitigate its effects. Traditional diagnostic methods often introduce delays and privacy concerns due to centralized data processing and subjective evaluations. To address these challenges, we propose a smartphone-based approach that uses federated learning to detect depressive episodes through the analysis of spontaneous phone calls. Our proposal protects user privacy by retaining data locally on user devices (i.e., smartphones). Our approach addresses catastrophic forgetting through the use of knowledge distillation, enabling efficient storage and robust learning. The experimental results demonstrate reasonable accuracy with minimal resource consumption, highlighting the potential of privacy-preserving AI solutions for mental health monitoring.

Area 2 - Agents

Full Papers
Paper Nr: 19
Title:

Would Microsoft Azure Stream Analytics Be a Suitable Foundation for an Event Processing Network Model?

Authors:

Arne Koschel, Anna Pakosch, Christin Schulze, Irina Astrova, Christian Gerner and Matthias Tyca

Abstract: This article looks at a proposed list of generalized requirements for a unified modelling of event processing networks (EPNs) and its application to Microsoft Azure Stream Analytics. It enhances our previous work in this area, in which we recently analyzed Apache Storm and Amazon Kinesis Data Analytics, and earlier also the EPiA model, the BEMN model, and the RuleCore model. Our proposed EPN requirements look at both the logical model of EPNs and their concrete technical implementation. Therefore, our article provides requirements for EPN models based on attributes derived from event processing in general as well as from existing models. Moreover, as its core contribution, our article applies those requirements in an in-depth analysis of Microsoft Azure Stream Analytics as a concrete implementation foundation of an EPN model.

Paper Nr: 28
Title:

Swarm Behavior Cloning

Authors:

Jonas Nüßlein, Maximilian Zorn, Philipp Altmann and Claudia Linnhoff-Popien

Abstract: In sequential decision-making environments, the primary approaches for training agents are Reinforcement Learning (RL) and Imitation Learning (IL). Unlike RL, which relies on modeling a reward function, IL leverages expert demonstrations, where an expert policy $\pi_e$ (e.g., a human) provides the desired behavior. Formally, a dataset $D$ of state-action pairs is provided: $D = (s, a = \pi_e(s))$. A common technique within IL is Behavior Cloning (BC), where a policy $\pi(s) = a$ is learned through supervised learning on $D$. Further improvements can be achieved by using an ensemble of $N$ individually trained BC policies, denoted as $E = \{\pi_i(s)\}_{1 \le i \le N}$. The ensemble’s action $a$ for a given state $s$ is the aggregated output of the $N$ actions: $a = \frac{1}{N} \sum_i \pi_i(s)$. This paper addresses the issue of increasing action differences: the observation that discrepancies between the $N$ predicted actions grow in states that are underrepresented in the training data. Large action differences can result in suboptimal aggregated actions. To address this, we propose a method that fosters greater alignment among the policies while preserving the diversity of their computations. This approach reduces action differences and ensures that the ensemble retains its inherent strengths, such as robustness and varied decision-making. We evaluate our approach across eight diverse environments, demonstrating a notable decrease in action differences and significant improvements in overall performance, as measured by mean episode returns.
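The ensemble aggregation in the abstract, the mean of the N policy outputs, can be sketched directly. The disagreement measure below (mean pairwise Euclidean distance) is an assumption, since the abstract does not define how action differences are quantified, and the two toy policies are hypothetical stand-ins for trained BC networks.

```python
import math
from itertools import combinations


def ensemble_action(policies, state):
    """Aggregate the ensemble by averaging the N predicted actions."""
    actions = [pi(state) for pi in policies]
    n = len(actions)
    dim = len(actions[0])
    return [sum(a[d] for a in actions) / n for d in range(dim)]


def mean_action_difference(policies, state):
    """Mean pairwise Euclidean distance between predicted actions
    (an assumed way to measure disagreement; needs at least 2 policies)."""
    actions = [pi(state) for pi in policies]
    pairs = list(combinations(actions, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def policy_a(state):
    """Toy policy (hypothetical stand-in for a trained BC network)."""
    return [state[0], 0.0]


def policy_b(state):
    """Second toy policy that disagrees with policy_a by a constant offset."""
    return [state[0] + 2.0, 0.0]


state = [1.0]
action = ensemble_action([policy_a, policy_b], state)
disagreement = mean_action_difference([policy_a, policy_b], state)
```

When disagreement is large, the averaged action may lie far from every individual prediction, which is the failure mode the abstract describes.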

Paper Nr: 36
Title:

MEDIATE: Mutually Endorsed Distributed Incentive Acknowledgment Token Exchange

Authors:

Philipp Altmann, Katharina Winter, Michael Kölle, Maximilian Zorn and Claudia Linnhoff-Popien

Abstract: Recent advances in multi-agent systems (MAS) have shown that incorporating peer incentivization (PI) mechanisms vastly improves cooperation. Especially in social dilemmas, communication between the agents helps to overcome sub-optimal Nash equilibria. However, incentivization tokens need to be carefully selected. Furthermore, real-world applications might yield increased privacy requirements and limited exchange. Therefore, we extend the PI protocol for mutual acknowledgment token exchange (MATE) and provide additional analysis on the impact of the chosen tokens. Building upon those insights, we propose mutually endorsed distributed incentive acknowledgment token exchange (MEDIATE), an extended PI architecture employing automatic token derivation via decentralized consensus. Empirical results show the stable agreement on appropriate tokens yielding superior performance compared to static tokens and state-of-the-art approaches in different social dilemma environments with various reward distributions.

Paper Nr: 99
Title:

Trust-Based Multi-Agent Authentication Decision Process for the Internet of Things

Authors:

Marc Saideh, Jean-Paul Jamont and Laurent Vercouter

Abstract: In the Internet of Things (IoT), systems often operate in open and dynamic environments composed of heterogeneous objects. Deploying a multi-agent system in such environments requires agents to interact with new agents and use their information and services. These interactions and resulting dependencies create vulnerabilities to malicious behaviors, highlighting the need for a robust trust management system. Multi-agent trust management models rely on observations of the behavior of other agents who must be authenticated. However, traditional authentication systems face significant limitations in adapting to diverse contexts and addressing the hardware constraints of the IoT. This paper proposes a novel trust-based multi-agent adaptive decision-making process for authentication in the IoT. Our approach dynamically adjusts authentication decisions based on the context and trustworthiness of the agent being authenticated, thereby balancing resource use for authentication with security needs and ensuring a more adaptable authentication process. We evaluate our model in a multi-agent navigation simulation, demonstrating its effectiveness for security and resource efficiency.

Paper Nr: 182
Title:

MIRSim-RL: A Simulated Mobile Industry Robot Platform and Benchmarks for Reinforcement Learning

Authors:

Qingkai Li, Zijian Ma, Chenxing Li, Yinlong Liu, Tobias Recker, Daniel Brauchle, Jan Seyler, Mingguo Zhao and Shahram Eivazi

Abstract: The field of mobile robotics has undergone a transformation in recent years due to advances in manipulation arms. One notable development is the integration of a 7-degree-of-freedom robotic arm into mobile platforms, which has greatly enhanced their ability to autonomously navigate while simultaneously executing complex manipulation tasks. As such, the success of these systems relies heavily on continuous path planning and precise control of arm movements. In this paper, we evaluate a whole-body control framework that tackles the dynamic instabilities associated with the floating base of mobile platforms in a simulation closely modeling real-world configurations and parameters. Moreover, we employ reinforcement learning to enhance the controller’s performance. We provide results from a detailed ablation study that shows the overall performance of various RL algorithms when optimized for task-specific behaviors over time. Our experimental results demonstrate the feasibility of achieving real-time control of the mobile robotic platform through this hybrid control framework.

Paper Nr: 190
Title:

Improving Temporal Knowledge Graph Forecasting via Multi-Rewards Mechanism and Confidence-Guided Tensor Decomposition Reinforcement Learning

Authors:

Nam Le, Thanh Le and Bac Le

Abstract: Temporal knowledge graph reasoning, which has received widespread attention in the knowledge graph research community, is a task that predicts missing facts in data. When framed as a problem of forecasting future events, it becomes more challenging than the conventional completion task. Reinforcement learning is one of the potential techniques to address these challenges. Specifically, an agent navigates through a historical snapshot of a knowledge graph to find answers to the input query. However, these learning frameworks suffer from two main drawbacks: (1) a simplistic reward function and (2) candidate action selection being influenced by data sparsity issues. To address these problems, we propose a multi-reward function that integrates binary, adjusted path-based, adjusted ground truth-based, and high-frequency rule rewards to enhance the agent’s performance. Furthermore, we incorporate recent advanced tensor decomposition methods such as TuckER, ComplEx, and LowFER to construct a reliability evaluation module for candidate actions, allowing the agent to make more reliable action choices. Our empirical results on benchmark datasets demonstrate significant improvements in performance while preserving computational efficiency and requiring fewer trainable parameters.

Paper Nr: 208
Title:

Enhancing Graph Clustering in Dynamic Networks with Distributed Online Life-Long Learning

Authors:

Hariprasauth Ramamoorthy, Rajkumar Vaidyanathan and Suresh Sundaram

Abstract: Trust and reputation assessment are critical in dynamic environments like recommendation systems, biological networks, and social networks. Malicious agents tend to collude to manipulate reputation for selfish reasons. However, traditional methods struggle to adapt to the evolving relationships and interactions within these networks. This paper introduces a novel approach that integrates Distributed Online Life-Long Learning (DOL3) with graph clustering to address the challenge of collusion. By enabling agents to continuously learn and update their clustering models, our approach enhances the system’s ability to detect malicious agents, maintain trust, and ensure the integrity of reputation scores. We present a detailed mathematical formulation of our algorithm, incorporating local clustering models, distributed consensus, and model adaptation. Experimental results on the Cora dataset demonstrate the superior performance of our approach compared to existing methods, particularly in terms of accuracy (an improvement of 11.8%) and adaptability to dynamic and complex network scenarios. The accuracy is measured using Normalized Mutual Information (NMI), a robust metric for comparing predicted and actual cluster assignments. Our findings highlight the effectiveness of DOL3-enhanced graph clustering in addressing the challenges of trust and reputation assessment in dynamic environments.
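Since the reported accuracy is an NMI score, a small reference computation may help. NMI has several normalization variants and the abstract does not say which one is used, so the arithmetic-mean normalization below is an assumption, as are the function names.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(true_labels, pred_labels):
    """Mutual information divided by the mean of the two entropies
    (assumed normalization variant)."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, pred_labels))
    count_true = Counter(true_labels)
    count_pred = Counter(pred_labels)
    mutual_info = sum(
        (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in joint.items()
    )
    mean_entropy = (entropy(true_labels) + entropy(pred_labels)) / 2
    # Two single-cluster assignments agree trivially; avoid dividing by zero.
    return 1.0 if mean_entropy == 0 else mutual_info / mean_entropy


# Same partition with permuted cluster ids: NMI is invariant to label names.
same_partition = nmi([0, 0, 1, 1], [1, 1, 0, 0])

# Statistically independent partitions share no information.
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the score ignores the names of the clusters, it suits clustering evaluation where predicted ids have no fixed correspondence to ground-truth classes.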

Paper Nr: 216
Title:

Multi-Face Emotion Detection for Effective Human-Robot Interaction

Authors:

Mohamed Ala Yahyaoui, Mouaad Oujabour, Leila Ben Letaifa and Amine Bohi

Abstract: The integration of dialogue interfaces in mobile devices has become ubiquitous, providing a wide array of services. As technology progresses, humanoid robots designed with human-like features to interact effectively with people are gaining prominence, and the use of advanced human-robot dialogue interfaces is continually expanding. In this context, emotion recognition plays a crucial role in enhancing human-robot interaction by enabling robots to understand human intentions. This research proposes a facial emotion detection interface integrated into a mobile humanoid robot, capable of displaying real-time emotions from multiple individuals on a user interface. To this end, various deep neural network models for facial expression recognition were developed and evaluated under consistent computer-based conditions, yielding promising results. Afterwards, a trade-off between accuracy and memory footprint was carefully considered to effectively implement this application on a mobile humanoid robot.

Paper Nr: 279
Title:

Integrating Traditional Technical Analysis with AI: A Multi-Agent LLM-Based Approach to Stock Market Forecasting

Authors:

Michał Wawer and Jarosław A. Chudziak

Abstract: Traditional technical analysis methods face limitations in accurately predicting trends in today’s complex financial markets. This paper introduces ElliottAgents, a multi-agent system that integrates the Elliott Wave Principle with AI for stock market forecasting. The inherent complexity of financial markets, characterized by non-linear dynamics, noise, and susceptibility to unpredictable external factors, poses significant challenges for accurate prediction. To address these challenges, the system employs LLMs to enhance natural language understanding and decision-making capabilities within a multi-agent framework. By leveraging technologies such as Retrieval-Augmented Generation (RAG) and Deep Reinforcement Learning (DRL), ElliottAgents performs continuous, multi-faceted analysis of market data to identify wave patterns and predict future price movements. The research explores the system’s ability to process historical stock data, recognize Elliott wave patterns, and generate actionable insights for traders. Experimental results, conducted on historical data from major U.S. companies, validate the system’s effectiveness in pattern recognition and trend forecasting across various time frames. This paper contributes to the field of AI-driven financial analysis by demonstrating how traditional technical analysis methods can be effectively combined with modern AI approaches to create more reliable and interpretable market prediction systems.
Download

Paper Nr: 303
Title:

Spiralling Human-Inspired Exploration Algorithm with Doorway Detection

Authors:

Rasmus Borrisholt Schmidt, Andreas Sebastian Sørensen, Thor Beregaard and Michele Albano

Abstract: Exploration of unknown environments is an important task for autonomous robot swarm systems. The faster they can fully explore an area, the faster a coordinated plan can be made, or points of interest found, to support further tasks. Previous algorithms have often focused on either frontier-based or nature-inspired heuristics. We present a human-inspired exploration algorithm, Minotaur, that enables simple and efficient exploration of buildings. We studied how Minotaur and a state-of-the-art algorithm, namely The Next Frontier (TNF), perform. Minotaur follows walls to discover doorways, after which it coordinates with robots in the same room to extend the exploration to rooms accessible through the discovered doorways. Most algorithms assume either perfect communication or line-of-sight (LOS) communication, which hinders the realism of the simulation results. We therefore modified an existing simulator to take into account realistic communication technologies that have limited penetration through walls. Comparative experiments between Minotaur, TNF, and a simple greedy algorithm show the superiority of Minotaur when multiple robots are exploring building-like maps. However, on cave-like maps, Minotaur appears to perform poorly, while the greedy algorithm outperforms TNF, particularly when the algorithms are limited in their communication capabilities.
Download

Paper Nr: 317
Title:

Improvement of PIBT-based Solution Method for Lifelong MAPD Problems to Extend Applicable Graphs

Authors:

Toshihiro Matsui

Abstract: We address an extension of priority inheritance with backtracking (PIBT) for lifelong multiagent pickup-and-delivery (MAPD) problems that integrates a swap operation into the original algorithm to adapt it to specific extended problem cases. The multiagent pathfinding (MAPF) problem has been widely studied as a basis for various practical multiagent systems. PIBT is a scalable and on-demand solution method for continuous MAPF problems, where each agent determines its next move in each time step by locally solving agent-move collisions. Since it can be applied only to limited cases such as biconnected graphs, several extensions using additional techniques have been suggested. However, there are opportunities to extend the PIBT process with several techniques that can be integrated into the solution process itself. As the first step, we extend a solution method based on PIBT for lifelong MAPD problems, which are fundamental continuous problems, by integrating a specific swap task. We address detailed techniques, including additional management of priorities, subgoals, and states of agents. We also experimentally evaluate the proposed approach with several problem settings.
Download

Paper Nr: 319
Title:

Negotiation Dialogue System Using a Deep Learning-Based Parser

Authors:

Kenjiro Morimoto, Katsuhide Fujita and Ken Watanabe

Abstract: In recent years, there has been substantial research on negotiation dialogue agents. A notable study introduced a method that decoupled strategy from generation using dialogue acts that encapsulated the intent behind utterances. This approach has enhanced both the task success rate and the human-like quality of the generated responses. However, the rule-based implementation of the parser limits the types of sentences it can process for dialogue acts. Thus, this paper presents annotated training data based on the proposed dialogue acts and introduces a deep learning-based parser. The deep learning-based parser achieved a dialogue act classification accuracy of approximately 83% and effectively reduced the occurrence of unknown dialogue acts. Additionally, negotiation dialogue systems using deep learning-based parsers have demonstrated improved performance in terms of utility and fairness.
Download

Paper Nr: 340
Title:

Satisfiability Checking for (Strategic) Timed CTL Using IMITATOR

Authors:

Wojciech Penczek, Laure Petrucci and Teofil Sidoruk

Abstract: The satisfiability problem for Timed CTL (TCTL) is well known to be undecidable in general. Therefore, we propose a bounded approach, involving parametric encoding of all possible timed automata up to a given size (with certain restrictions). For this parametric timed automaton and an input formula, we synthesise parameter values using the IMITATOR model checker, and thus obtain a concrete model in which the formula is satisfied (if one exists within the bound). In other words, we define a partial algorithm for TCTL satisfiability checking. Moreover, we show how to represent memoryless strategies of agents with imperfect information via an auxiliary set of IMITATOR parameters, thereby applying our method also to Strategic TCTL (STCTL), the recently proposed extension of TCTL with the strategic modality. We evaluate practical feasibility on three benchmarks, including scalable instances.
Download

Paper Nr: 379
Title:

Double Q-Learning for a Simple Parking Problem: Propositions of Reward Functions and State Representations

Authors:

Przemysław Klȩsk

Abstract: We consider a simple parking problem where the goal for the learning agent is to park the car from a range of initial random positions to a target place with front and back end-points distinguished, without obstacles in the scene but with an imposed time regime, e.g. 25s. It is a sequential decision problem with a continuous state space and a high frequency of decisions to be taken. We employ the double Q-learning computational approach, using the bang–bang control and neural approximations for the Q functions. Our main focus is on the design of rewards and state representations for this problem. We propose a family of parameterized reward functions that include, in particular, a penalty for the so-called “gutter distance”. We also study several variants of vector state representations that (apart from observing velocity and direction) relate some key points on the car with key points in the parking place. We show that a suitable combination of the state representation and rewards can effectively guide the agent towards better trajectories. Thereby, the learning procedure can be carried out within a reasonably small number of episodes, resulting in a high success rate at the testing stage.
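For readers unfamiliar with the double Q-learning update mentioned in the abstract above, the following is a minimal tabular sketch. It is an illustration only: the paper uses neural approximations of the Q functions, whereas this sketch uses two lookup tables, and the function name, the learning rate `alpha`, the discount `gamma`, and the action labels are assumptions rather than details taken from the paper.

```python
import random
from collections import defaultdict


def double_q_update(q_a, q_b, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.99, done=False, rng=random):
    """Perform one tabular double Q-learning step (illustrative sketch).

    q_a and q_b map (state, action) pairs to value estimates.
    One table is chosen at random and updated; the other table
    evaluates the action that the updated table considers best.
    """
    # Randomly decide which table is updated in this step.
    if rng.random() < 0.5:
        q_upd, q_eval = q_a, q_b
    else:
        q_upd, q_eval = q_b, q_a

    if done:
        # Terminal transition: no bootstrapping.
        target = reward
    else:
        # Select the greedy action with the table being updated,
        # but evaluate it with the other table.
        best = max(actions, key=lambda a: q_upd[(next_state, a)])
        target = reward + gamma * q_eval[(next_state, best)]

    q_upd[(state, action)] += alpha * (target - q_upd[(state, action)])


# Hypothetical usage with two bang-bang style actions.
q_a, q_b = defaultdict(float), defaultdict(float)
double_q_update(q_a, q_b, state=0, action="forward", reward=1.0,
                next_state=1, actions=("forward", "backward"), done=True)
```

Decoupling action selection from action evaluation in this way is what distinguishes double Q-learning from the single-table variant, which tends to overestimate action values.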
Download

Paper Nr: 399
Title:

Impact of Pinging in Financial Markets: An Agent Based Study

Authors:

Sriram Bharadwaj Rangarajan and Carmine Ventre

Abstract: Institutional traders in the financial markets rely on hidden trading venues to execute significantly large trades with lower execution costs and reduced information leakage. One such trading venue, known as a dark pool, offers institutional traders better execution costs through hidden order books and delayed trade reporting. Despite their advantages, dark pools are susceptible to market manipulation practices such as “pinging”. Due to low transparency in dark pools, the incentives of pinging agents and their impact on market participants have not been studied in detail. In this paper, we present an agent-based model of the financial markets to study the market impact of trading strategies and the dynamics of pinging in dark pools. We identify the scenarios and market conditions under which pinging is a profitable manipulation strategy and compute its impact on execution costs of informed institutional trading agents. Further, we consider agent incentives and use empirical game theory to compute the equilibrium state of the market and quantify the additional costs imposed by pinging agents on informed traders. This study aims to bridge the existing research gap by providing a framework for analyzing market manipulation in dark pools and is a foundational step towards designing safer dark pools.
Download

Paper Nr: 419
Title:

Incentive Design in Hedonic Games with Permission Structures

Authors:

Yuta Akahoshi, Yao Zhang, Kei Kimura, Taiki Todo and Makoto Yokoo

Abstract: This paper investigates which coalition structure generation algorithms guarantee the incentive of agents to invite as many colleagues as possible in symmetric additively-separable hedonic games. We first clarify that the incentive of invitation is not compatible with either Nash stability or Pareto efficiency. Furthermore, we show that the worst-case ratio of social surplus achieved by any algorithm satisfying the incentive of invitation, compared to the best possible social surplus, is unboundedly small. We then introduce two problem restrictions to achieve somewhat positive results. More specifically, we show that, when the utility graph of a hedonic game only contains three values, {−p, 0, p}, for some positive number p, there exists a polynomial-time algorithm to achieve both the incentive of invitation and 1/n-approximation with respect to the social surplus.
Download

Paper Nr: 423
Title:

TRIZ Agents: A Multi-Agent LLM Approach for TRIZ-Based Innovation

Authors:

Kamil Szczepanik and Jarosław A. Chudziak

Abstract: TRIZ, the Theory of Inventive Problem Solving, is a structured, knowledge-based framework for innovation and abstracting problems to find inventive solutions. However, its application is often limited by the complexity and deep interdisciplinary knowledge required. Advancements in Large Language Models (LLMs) have revealed new possibilities for automating parts of this process. While previous studies have explored single LLMs in TRIZ applications, this paper introduces a multi-agent approach. We propose an LLM-based multi-agent system, called TRIZ agents, each with specialized capabilities and tool access, collaboratively solving inventive problems based on the TRIZ methodology. This multi-agent system leverages agents with various domain expertise to efficiently navigate TRIZ steps. The aim is to model and simulate an inventive process with language agents. We assess the effectiveness of this team of agents in addressing complex innovation challenges based on a selected case study in engineering. We demonstrate the potential of agent collaboration to produce diverse, inventive solutions. This research contributes to the future of AI-driven innovation, showcasing the advantages of decentralized problem-solving in complex ideation tasks.
Download

Paper Nr: 425
Title:

Construction of Football Agents by Inverse Reinforcement Learning Using Relative Positional Information Among Players

Authors:

Daiki Wakabayashi, Tomoaki Yamazaki and Kouzou Ohara

Abstract: Recent advancements in reinforcement learning have made it possible to develop football agents that autonomously emulate the behavior of human players. However, it is still challenging for existing methods to successfully replicate realistic player behaviors. In fact, agents exhibit behaviors like clustering around the ball or shooting prematurely. One cause of this problem lies in reward functions that always assign large rewards to certain actions, such as scoring a goal, regardless of the situation, which bias agents towards high-reward actions. In this study, we incorporate the relative positional reward and the positional weight for shooting into the reward function used for reinforcement learning. The relative positional reward, derived from the positions of players, the ball, and the goal, is estimated using inverse reinforcement learning on a dataset of real football games. The positional weight for shooting is similarly based on actual shooting positions observed in these games. Through experiments on a dataset derived from real football games, we demonstrate that the relative positional reward helps align the agents’ behaviors more closely with those of human players.
Download

Paper Nr: 491
Title:

Formal Analysis of Deontic Logic Model for Ethical Decisions

Authors:

Krishnendu Ghosh and Channing Smith

Abstract: Ethical decision making is key in the certification of autonomous systems. Modeling and verification of the actions of an autonomous system become imperative. An automated model abstraction for an autonomous system is constructed based on components of deontic logic such as obligatory, permissible, and forbidden actions. Temporal logic queries have been formulated and posed to evaluate ethical decision making. A prototype of the formalism is constructed and model checking is performed. Experiments were conducted to evaluate the computational feasibility of the formalism. The experimental results are presented.
Download

Paper Nr: 504
Title:

Investigation of MAPF Problem Considering Fairness and Worst Case

Authors:

Toshihiro Matsui

Abstract: We investigate multiagent pathfinding problems that improve fairness and the worst case among multiple objective values for individual agents or facilities. Multiagent pathfinding (MAPF) problems have been widely studied as a fundamental class of problem in multiagent systems. A common objective to be optimized in MAPF problem settings is the total cost value of the moves and actions for all agents. Another optimization criterion is the makespan, which is equivalent to the maximum cost value for all agents in a single instance of MAPF problems. As one direction of extended MAPF problems, multiple-objective problems have been studied. In general, multiple objectives represent different types of characteristics to be simultaneously optimized for a solution that is a set of agents' paths in the case of MAPF problems, and Pareto optimality is regarded as a common criterion. Here, we focus on an optimization criterion related to fairness and the worst case among the agents themselves or the facilities affected by the agents' plans, and this is also a subset of the makespan criterion. This involves several situations where the utilities/costs, including robots' lifetimes and related humans' workloads, should be balanced among individual robots or facilities without employing external payments. The applicability of these types of criteria has been investigated in several optimization problems, including distributed constraint optimization problems, multi-objective reinforcement learning, and single-agent pathfinding problems. In this study, we address the case of MAPF problems and experimentally analyze the proposed approach to reveal its effect, as well as related issues, in this class of problems.
Download

Paper Nr: 505
Title:

What Kind of Information Is Needed? Multi-Agent Reinforcement Learning that Selectively Shares Information from Other Agents

Authors:

Riku Sakagami and Keiki Takadama

Abstract: Since agents' learning affects others' learning in multi-agent reinforcement learning (MARL), this paper aims to clarify what kind of information helps to improve the learning of agents through complex interactions among them. For this purpose, this paper focuses on the information on observations/actions of other agents and analyzes its effect in MARL with centralized training with decentralized execution (CTDE), which contributes to stabilizing agents' learning. Concretely, this paper extends the conventional MARL algorithm with CTDE (i.e., MADDPG in this research) with three mechanisms, which share (i) information on observations of all agents; (ii) information on actions of all agents; and (iii) information on both the observations and actions of the selected agents. MARL with these three mechanisms is compared with MADDPG, which shares information on both actions and observations of all agents, and IDDPG, which does not share any information. The experiments on multi-agent particle environments (MPEs) have revealed that the proposed method that selectively shares both observation and action information is superior to the other methods in both the full and partial observation environments where information on observations of all and selected agents.
Download

Short Papers
Paper Nr: 29
Title:

Optimizing Sensor Redundancy in Sequential Decision-Making Problems

Authors:

Jonas Nüßlein, Maximilian Zorn, Fabian Ritz, Jonas Stein, Gerhard Stenzel, Julian Schönberger, Thomas Gabor and Claudia Linnhoff-Popien

Abstract: Reinforcement Learning (RL) policies are designed to predict actions based on current observations to maximize cumulative future rewards. In real-world, i.e. not simulated, environments, sensors are essential for measuring the current state and providing the observations on which RL policies rely to make decisions. A significant challenge in deploying RL policies in real-world scenarios is handling sensor dropouts, which can result from hardware malfunctions, physical damage, or environmental factors like dust on a camera lens. A common strategy to mitigate this issue is to use backup sensors, though this comes with added costs. This paper explores the optimization of backup sensor configurations to maximize expected returns while keeping costs below a specified threshold, C. Our approach uses a second-order approximation of expected returns and includes penalties for exceeding cost constraints. The approach is evaluated across eight OpenAI Gym environments and a custom Unity-based robotic environment (RobotArmGrasping). Empirical results demonstrate that our quadratic program effectively approximates real expected returns, facilitating the identification of optimal sensor configurations.
Download

Paper Nr: 42
Title:

Privacy-Preserving Self-Organization in Distributed Energy Scheduling

Authors:

Joerg Bremer and Sebastian Lehnhoff

Abstract: Negotiation among agents that are controlling and orchestrating a set of distributed processes often relies on frequent data exchange to allow solution evaluation and thus convergence towards a joint solution. Solving decentralized coordination problems with coalitions of agents that exchange messages and information to build beliefs for problem solving inevitably allows insight into other agents’ operational options. Keeping local information private is thus of utmost importance for a wide user acceptance of such algorithms. We present an extension to a distributed, self-organizing algorithm for energy scheduling in virtual power plants or energetic neighborhoods that keeps all information about possible operations of participating energy resources private. For calculations during optimization, the algorithm relies on secret sharing and joint multi-party computations. We evaluate the algorithm against the original non-privacy-preserving standard version and present some insights for future work.
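The abstract above names secret sharing and multi-party computation without specifying a scheme. As a hedged illustration of the general idea, the sketch below uses additive secret sharing to compute a joint sum without any party revealing its own value. The modulus `PRIME`, the function names, and the example values are assumptions for illustration and are not taken from the paper.

```python
import secrets

# Public modulus for the share arithmetic (an illustrative choice).
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares add up to the secret.
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(private_values):
    """Compute the sum of all private values from shares only."""
    n = len(private_values)
    # Each party i splits its value and sends share j to party j.
    all_shares = [share(v, n) for v in private_values]
    # Each party j adds up the shares it received and publishes the result.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % PRIME
                    for j in range(n)]
    # The published partial sums reveal only the total.
    return sum(partial_sums) % PRIME


# Hypothetical example: three energy resources with private power values.
total = secure_sum([5, 7, 11])
```

In this simplified setting each individual share is uniformly random, so a single received share carries no information about the sender's value; only the aggregate becomes known.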
Download

Paper Nr: 49
Title:

A New Planning Agent Architecture that Efficiently Integrates an Online Planner with External Legal and Ethical Checkers

Authors:

Hisashi Hayshi, Yousef Taheri, Kanae Tsushima, Gauvain Bourgne, Jean-Gabriel Ganascia and Ken Satoh

Abstract: Transferring and using datasets online presents significant legal and ethical challenges, including issues related to privacy, safety, and bias. Careful planning is essential for compliance with the diverse legal frameworks and ethical standards of different countries. In our approach, legal and ethical checkers are implemented as independent modules capable of operating on separate servers if necessary. This structure is logical given the specialized knowledge required to express legal and ethical norms specific to each country. This paper describes the integration of a planning agent that employs an online Hierarchical Task Network (HTN) planner with these legal and ethical checkers. It also introduces, assesses, and compares three different interaction modes between these modules to facilitate efficient online legal and ethical planning. The assessment emphasizes interaction frequency and computation time, with scenarios related to international data transfer and usage demonstrating the effectiveness of the proposed approach. By exploring these interaction modes, the paper aims to provide a robust framework for managing the complexities of adhering to diverse legal and ethical requirements in a global context.
Download

Paper Nr: 97
Title:

Ethics of Autonomous Vehicles: Australians’ Expectations and Moral Preferences

Authors:

Amir Rafiee

Abstract: Autonomous Vehicles (AVs) can handle most driving scenarios, but ensuring safety in every situation remains a challenge. Factors such as technology failures, faulty sensors, and adverse weather introduce complex ethical dilemmas that AVs must navigate. Considering the societal benefits of AVs, it is crucial to address both technical challenges and ethical expectations. This paper evaluates Australians’ perceptions and expectations regarding the ethical programming of personal AVs in six dilemma scenarios using a structured questionnaire. The participants selected the most acceptable outcome in each scenario, informed by ethical and legal considerations. The survey offers a framework for understanding public moral preferences by excluding discriminatory factors and considering legal contexts. The findings prioritise Australians’ preferences for ethical AV behaviour, focusing on Injury Over Sacrifice (IOS), Harm Confinement and Lawfulness (HCL), and Harm Prevention and Prioritisation (HPP). These insights can guide policymakers and manufacturers in aligning AV programming with societal values. The study also highlights how ethical models like the Objective Decision System (ODS), which selects outcomes randomly when no clear moral preference emerges, can balance public trust and responsibility in AVs.
Download

Paper Nr: 121
Title:

Swarm Intelligence-Based Algorithm for Workload Placement in Edge-Fog-Cloud Continuum

Authors:

Kefan Wu, Abdorasoul Ghasemi and Melanie Schranz

Abstract: This paper addresses the workload placement problem in the edge-fog-cloud continuum. We model the edge-fog-cloud computing continuum as a multi-agent framework consisting of networked resource supply and demand agents. Inspired by the swarm intelligence behavior of the ant colony optimization, we propose a workload scheduler for the arriving demand agents to increase local resource utilization and reduce communication costs without relying on a centralized scheduler. Like the ants, the demand agents will release pheromones on the resource agent to indicate the available resources. The next arriving demand agent will most probably choose a neighbor, following the pheromone value and communication cost. The framework’s performance is evaluated in terms of local resource utilization, dependency on fog and cloud, and communication cost. We compare these metrics for the ant-inspired algorithm with random and greedy algorithms. The simulation results reveal that the proposed algorithm inspired by swarm intelligence can increase resource utilization at the edge and reduce the dependency on higher layers, while also decreasing the communication cost for the task of resource allocation.
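The pheromone-guided neighbor choice described in the abstract above can be sketched with the standard ant colony optimization selection rule, in which the probability of choosing a neighbor grows with its pheromone value and shrinks with its communication cost. The exact rule used in the paper is not given in the abstract, so the weighting formula, the exponents `alpha` and `beta`, and all names below are assumptions.

```python
import random


def choice_probabilities(pheromone, cost, alpha=1.0, beta=1.0):
    """Return selection probabilities for each neighboring resource agent.

    pheromone: dict mapping neighbor -> pheromone value (non-negative)
    cost:      dict mapping neighbor -> communication cost (positive)
    """
    # Classic ACO weighting: pheromone^alpha * (1 / cost)^beta.
    weights = {n: (pheromone[n] ** alpha) * ((1.0 / cost[n]) ** beta)
               for n in pheromone}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def choose_neighbor(pheromone, cost, rng=random):
    """Sample one neighbor according to the probabilities above."""
    probs = choice_probabilities(pheromone, cost)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


# Hypothetical example: two edge nodes with equal cost but unequal pheromone.
probs = choice_probabilities({"edge_a": 2.0, "edge_b": 1.0},
                             {"edge_a": 1.0, "edge_b": 1.0})
```

Because the choice is probabilistic rather than greedy, less attractive neighbors are still selected occasionally, which keeps the scheduler decentralized while spreading load.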
Download

Paper Nr: 122
Title:

Coordinated Self-Exploration for Self-Adaptive Systems in Contested Environments

Authors:

Saad Sajid Hashmi, Hoa Khanh Dam, Alan Colman, Anton V. Uzunov, Quoc Bao Vo, Mohan Baruwal Chhetri and James Dorevski

Abstract: Enhancing the resilience and flexibility of distributed software systems is critical in challenging environments where adversaries can actively undermine performance and system operations. One approach for achieving resilience is to employ collections of intelligent software agents that can autonomously execute management actions and adapt a target system according to pre-defined goals, thereby realising various self-* properties. Self-exploration is one such self-* property, relating to a system’s ability to compute resilient responses through an adversarial game-tree search process that takes into account an adversary’s action-effects on goals. Unlike the current realisation of self-exploration that assumes goal independence, we propose a novel approach that addresses goal inter-dependencies through agent coordination, ensuring more realistic and effective counter-responses. We provide a correctness proof and evaluate the performance of our algorithm.
Download

Paper Nr: 125
Title:

Who Knows Who: A Context-Based Approach to Network Generation for Social Simulations

Authors:

Veronika Kurchyna, Philipp Flügger, Ye Eun Bae, Jan Ole Berndt and Ingo J. Timm

Abstract: Established network models (small-world, preferential attachment, random, etc.) often fail to capture the full range of characteristics observed in real social networks, potentially limiting the transferability of model results. To address this limitation, we propose a Contextual Network Model that is superimposed on a synthetic population. The model takes into account sociodemographic agent-traits and location data, such as workplaces and frequent leisure activities, to construct a realistic network. To showcase the effect of network topology on model dynamics, we investigate a Susceptible-Infectious-Removed (SIR) model with information diffusion by comparing our proposed network model with the aforementioned established network models. The study identifies an earlier, lower peak of infectious agents, along with a greater number of susceptible agents remaining at the end of the simulation for the proposed network model. Moreover, the study underscores the measurable impact of network topology on model behaviour, revealing different expansion rates and patterns in the information diffusion process. Additionally, this work offers instructions for a customisable implementation of a contextual network model generator for other agent-based models and populations.
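To make the role of network topology in the abstract above concrete, here is a minimal discrete-time SIR step on an arbitrary contact network. It covers only the basic epidemic dynamics and omits the information diffusion component of the paper; the per-contact infection probability `beta`, the recovery probability `gamma`, and the adjacency-dict representation are assumptions for illustration.

```python
import random


def sir_step(neighbors, state, beta, gamma, rng=random):
    """Advance a networked SIR model by one time step.

    neighbors: dict mapping node -> iterable of adjacent nodes
    state:     dict mapping node -> "S", "I", or "R"
    beta:      probability that an infectious node infects a susceptible neighbor
    gamma:     probability that an infectious node is removed in this step
    """
    new_state = dict(state)
    for node, status in state.items():
        if status != "I":
            continue
        # Try to infect each susceptible neighbor (based on the old state).
        for nb in neighbors[node]:
            if state[nb] == "S" and rng.random() < beta:
                new_state[nb] = "I"
        # The infectious node may then be removed.
        if rng.random() < gamma:
            new_state[node] = "R"
    return new_state


# Hypothetical example: a three-node path network, one initial infection.
path = {0: [1], 1: [0, 2], 2: [1]}
state = {0: "I", 1: "S", 2: "S"}
state = sir_step(path, state, beta=1.0, gamma=1.0)
```

Swapping in a different `neighbors` structure (small-world, preferential attachment, or a context-based network) while keeping `beta` and `gamma` fixed is one simple way to compare how topology alone changes the epidemic curve.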
Download

Paper Nr: 139
Title:

Deep Reinforcement Learning for Auctions: Evaluating Bidding Strategies Effectiveness and Convergence

Authors:

Luis Eduardo Craizer, Edward Hermann and Moacyr Alvim Silva

Abstract: This paper extends our previous work on using deep reinforcement learning, specifically the MADDPG algorithm, to analyze and optimize bidding strategies across different auction scenarios. Our current research aims to empirically verify whether the agents’ optimal policies, achieved after model convergence, approach a near-Nash equilibrium in various auction settings. We propose a novel empirical strategy that compares the learned policy of each agent, derived through the deep reinforcement learning algorithm, with an optimal bid strategy obtained via an exhaustive search based on bid points from other participants. This comparative analysis encompasses different auctions, revealing various equilibrium scenarios. Our findings contribute to a deeper understanding of decision-making dynamics in multi-agent environments and provide valuable insights into the robustness of deep reinforcement learning techniques in auction theory.
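The comparison described in the abstract above, between a learned bid and an optimal bid found by exhaustive search against other participants' bid points, can be sketched for a first-price sealed-bid auction as follows. This is an illustrative assumption about the setup, not the authors' code: the auction format, the bid grid, the tie rule, and the function names are all hypothetical.

```python
def expected_utility(value, bid, opponent_bids):
    """Empirical expected utility of a bid in a first-price auction.

    The bidder wins only if the bid is strictly higher than the sampled
    opponent bid (ties count as losses in this sketch) and then pays its bid.
    """
    wins = sum(1 for b in opponent_bids if bid > b)
    return (value - bid) * wins / len(opponent_bids)


def best_response(value, opponent_bids, grid_size=101):
    """Exhaustively search a bid grid on [0, 1] for the best response."""
    candidates = [i / (grid_size - 1) for i in range(grid_size)]
    return max(candidates,
               key=lambda b: expected_utility(value, b, opponent_bids))


# Hypothetical example: one opponent whose value is spread evenly over [0, 1]
# and who bids half of its value.
opponent_bids = [(i / 1000) / 2 for i in range(1001)]
bid = best_response(0.8, opponent_bids)
```

In this two-bidder example the search returns a bid of half the bidder's value, matching the known equilibrium strategy for uniformly distributed values; a gap between a learned policy's bid and such a best response is one possible measure of distance from equilibrium.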
Download

Paper Nr: 152
Title:

A Computational Model of Trustworthiness: Trust-Based Interactions Between Agents in Multi Agent System

Authors:

Basten Leeftink, Britta Abbink Spaink, Tomasz Zurek and Tom Van Engers

Abstract: In our research group working on normative systems, we develop (Normative) Agent Based Models for evaluating policies, and as a basis for building distributed (normative) control components. Whether and how interactions between actors (represented by agents) take place is heavily impacted by the (dis)trust between those actors. In this paper, we discuss a model of the representation of the three components of the agent’s trustworthiness: competence, benevolence, and integrity. The model presented in this paper is illustrated by a small simulation experiment.
Download

Paper Nr: 191
Title:

Pre-Trained Models and Fine-Tuning for Negotiation Strategies with End-to-End Reinforcement Learning

Authors:

Yuji Kobayashi and Katsuhide Fujita

Abstract: In the field of automated negotiation, designing negotiation strategies that can handle any opponent is a key goal, and end-to-end reinforcement learning methods have been proposed. However, existing methods learn for each specific agent individually, which leads to the risk of overfitting to that agent, making it difficult to adapt to different situations or strategy changes even with the same agent. In addition, retraining from scratch is necessary when facing unknown opponents. To address these challenges, this study proposes a method that applies pre-training and fine-tuning to the model within an end-to-end reinforcement learning framework. Through evaluations, we demonstrate that the pre-trained model exhibits high generalizability. Furthermore, we show that fine-tuning the pre-trained model has the potential not only to further improve performance but also to achieve high performance against unknown agents.
Download

Paper Nr: 201
Title:

Combining Procedural Generation and Genetic Algorithms to Model Urban Growth

Authors:

Etienne Tack, Gilles Énée and Frédéric Flouvat

Abstract: In this paper, we present an approach to model spatial influences in multi-agent models using procedural generation and genetic algorithms. We applied this approach in an urban growth model. In agent-based simulations, the agents make decisions based on the perception of their environment. In our context, the agents represent inhabitants who can create new buildings or extend the existing ones. Their behaviour is ruled by spatial influences (e.g., the proximity of a road increases the chances of building in the surrounding areas). Procedural generation provides a good framework for representing the influences of the environment on the agent’s behaviour. Each spatial feature is associated with an influence function. The parameter search space of these functions can be tremendous, making it difficult for field experts to set them manually. Consequently, we use a genetic algorithm to optimize the parameters of these influence functions and train the model based on three spatial measures (Chamfer distance, kernel density, and a density grid). This approach can likewise be applied to any problem where the agent decisions are wholly or partly based on location. Our experiments highlight the interest of our approach and the impact of the chosen fitness functions.
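Of the three spatial measures named in the abstract above, the Chamfer distance is the simplest to state. The sketch below uses one common symmetric definition, the sum of the two average nearest-neighbor distances; other variants (squared distances, unnormalized sums) exist, and the abstract does not say which one the authors use, so the definition and names here are assumptions.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Each point is a tuple of coordinates, e.g. (x, y) building positions.
    """
    def one_way(src, dst):
        # Average distance from each source point to its nearest target point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(points_a, points_b) + one_way(points_b, points_a)


# Hypothetical example: simulated versus observed building locations.
simulated = [(0.0, 0.0), (1.0, 0.0)]
observed = [(0.0, 0.0), (1.0, 1.0)]
fitness = chamfer_distance(simulated, observed)
```

Used as a fitness function in a genetic algorithm, a lower value would indicate that the simulated buildings lie closer to the observed ones.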
Download

Paper Nr: 204
Title:

Cost-Effective Robotic Recycling Workers: From Lab Experiments to Real-World Deployment

Authors:

Nikolaos Kounalakis, Georgios Alexakis, Fredy Raptopoulos and Michail Maniadakis

Abstract: In recent years, the deployment of robots in recycling processes has gained significant attention as a transformative solution to enhance efficiency and sustainability. Most existing solutions rely on general-purpose robots that have been widely used in pick-and-place applications. However, the use of general-purpose robots for waste recovery may often be an unreasonably high-cost solution because the strong features provided by the robots, in particular fast and high-precision movements, do not provide any tangible benefits in the context of waste recovery. The present work proposes the use of a much simpler architecture that we call Robotic Recycling Worker (RoReWo). This is based on a cartesian manipulator which substantially lowers implementation costs, sacrificing only a modest percentage of performance. Essentially, with the same budget required for a competent robotic arm, one can assemble a team of RoReWos that, in waste sorting applications, outperforms the given arm. The proposed RoReWo architecture comprises a linear, fully controlled cartesian manipulator designed to traverse the width of a waste-carrying belt. This setup is complemented by a cost-effective piston with binary-state functionality mounted on the cartesian carriage, facilitating rapid waste object retrieval by moving vertically towards the belt. The architecture is termed the 1.5 Degrees of Freedom robot due to the partial, binary-state control of the piston. The current work presents the RoReWo architecture and demonstrates results that highlight its potential to markedly improve material recovery from waste streams.

Paper Nr: 220
Title:

Multimodal Web Agents for Automated (Dark) Web Navigation

Authors:

Mrunal Vibhute, Neol Gutierrez, Kristina Radivojevic and Paul Brenner

Abstract: Studying marketplaces hosted on the dark web is challenging due to the robust security measures these platforms use to protect user anonymity and prevent unauthorized access. While these marketplaces facilitate the trade of illegal goods and services, their use of CAPTCHAs, encryption, and the Tor network creates significant barriers for researchers attempting to gather data. We developed a software agent capable of overcoming the obstacles to automating the navigation of these marketplaces. The tool is specifically designed for ethical and legal research, helping cybersecurity experts identify and analyze dark web activities to mitigate potential threats. Built using Python and Selenium WebDriver, and operating within the Tor Browser for anonymity, our agent uses Multimodal Large Language Models (MLLMs) to help automate the data acquisition process. These models can interpret both text and images, enabling the agent to solve complex CAPTCHAs that would otherwise block bots from accessing the marketplace. Once logged in, the agent automatically collects important data like vendor details, product categories, and prices. Additionally, the data collected in this process is publicly available as downloadable files in our GitHub repository. Our research also provides valuable insights into security trends and patterns within these marketplaces, shedding light on the activities taking place within these clandestine networks.
Download

Paper Nr: 250
Title:

Strategic Returns Prevention in E-Commerce: Simulating Financial and Environmental Outcomes Through Agent-Based Modeling

Authors:

Marie Niederlaender, Urs Liebau, Yajing Chen, Emil Breustedt, Saad Driouech and Dirk Werth

Abstract: Product returns pose an environmental and financial burden on manufacturers and online retailers worldwide, especially in the fashion sector. Over 50% of all ordered garments end up being returned, which gives rise to an ongoing search for approaches to successfully manage returns or to avoid returns in the first place. For both approaches, an accurate prediction of returns can be useful, since it allows for an improved inventory risk assessment and strategic reselling of garments, while also providing crucial information on common drivers of return rates. This study focuses on preventive strategies in the context of customers placing selection orders in online shops. An agent-based approach provides insight into the outcomes of three different return prevention strategies, which are compared with the original outcome of real-world data from a German clothing manufacturer selling garments for special occasions. The four outcomes are analysed in terms of their financial and environmental impact, utilising common life cycle assessment strategies.
Download

Paper Nr: 274
Title:

Formation Analysis for a Fleet of Drones: A Mathematical Framework

Authors:

Emiliano Traversi, Michal Barcis, Lorenzo Bellone, Agata Barcis, Dina Ahmim-Bonaldi, Eliseo Ferrante and Enrico Natalizio

Abstract: We consider a dynamic coverage scenario, where a group of agents (e.g., Unmanned Aerial Vehicles (UAVs)) is exploring an environment in search of a moving target (e.g., survivors on a lifeboat). We assume UAVs are capable of achieving, maintaining, and moving in formation (e.g., to maintain connectivity). This paper addresses the question “Which formation maximizes the chance of finding the target?”. We propose a mathematical framework to answer this question. The proposed framework is generic and can be easily applied to various formations and missions. We show how the framework can identify which formation will result in better performance in the type of missions we consider. We analyze how different factors, namely the target speed relative to the group, affect the performance of the formations. We validate the framework against simulations of the considered scenarios. The supplementary video material including the real-world implementation is available at https://youtu.be/ mYmTnAJi-I?si=dSmVVNZOjj5NbSG1.
Download

Paper Nr: 292
Title:

GOLLUM: Guiding cOnfiguration of firewaLL Through aUgmented Large Language Models

Authors:

Roberto Lorusso, Antonio Maci and Antonio Coscia

Abstract: Artificial intelligence (AI) tools offer significant potential in network security, particularly for addressing issues like firewall misconfiguration, which can lead to security flaws. Configuration support services can help prevent errors by providing clear general-purpose language instructions, thus minimizing the need for manual references. Large language models (LLMs) are AI-based agents that use deep neural networks to understand and generate human language. However, LLMs are generalists by construction and may lack the knowledge needed in specific fields, thereby requiring links to external sources to perform highly specialized tasks. To meet these needs, this paper proposes GOLLUM, a conversational agent designed to guide firewall configurations using augmented LLMs. GOLLUM integrates the pfSense firewall documentation via a retrieval-augmented generation approach, providing an example of actual use. The generative models used in GOLLUM were selected based on their performance on the state-of-the-art NetConfEval and CyberMetric datasets. Additionally, to assess the effectiveness of the proposed application, an automated evaluation pipeline, involving RAGAS as test dataset generator and a panel of LLMs for judgment, was implemented. The experimental results indicate that GOLLUM, powered by LLama3-8B, provides accurate and faithful support in three out of four cases, while achieving more than 80% answer correctness in configuration-related queries.
Download

Paper Nr: 301
Title:

Performance Analysis and Failure Mitigation Strategies for a Resilient Dynamic Evacuation Guidance System

Authors:

Akira Tsurushima

Abstract: Resilience is a critical factor in dynamic evacuation guidance systems, which must remain functional in harsh environments. However, most evacuation studies have seldom addressed system resilience. This study proposes a distributed dynamic evacuation guidance system that sustains functionality even when some components are damaged during evacuation, thereby enhancing the overall reliability and redundancy of the system by avoiding single points of failure. We evaluated the system performance through asynchronous multi-agent simulations to assess its effectiveness in maintaining guidance during a spreading fire that compromised its components. The experiments revealed that the proposed system with failed components performed comparably to a fully operational system when failures occurred in response to the fire severity. The adverse effects of random component failure were mitigated using two strategies: spatial interpolation and persistent guidance, resulting in a performance comparable to that of a failure-free system.
Download

Paper Nr: 306
Title:

Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning

Authors:

Gavin B. Rens

Abstract: Humanoid robots must master numerous tasks with sparse rewards, posing a challenge for reinforcement learning (RL). We propose a method combining RL and automated planning to address this. Our approach uses short goal-conditioned policies (GCPs) organized hierarchically, with Monte Carlo Tree Search (MCTS) planning using high-level actions (HLAs). Instead of primitive actions, the planning process generates HLAs. A single plan-tree, maintained during the agent’s lifetime, holds knowledge about goal achievement. This hierarchy enhances sample efficiency and speeds up reasoning by reusing HLAs and anticipating future actions. Our Hierarchical Goal-Conditioned Policy Planning (HGCPP) framework uniquely integrates GCPs, MCTS, and hierarchical RL, potentially improving exploration and planning in complex tasks.
Download

Paper Nr: 311
Title:

Simulative Analysis of Multi-Agent Systems in Energy Systems: Impact of Communication Networks

Authors:

Malin Radtke and Emilie Frost

Abstract: This paper addresses the growing use of Multi-Agent Systems (MASs) in power systems, particularly within the context of Cyber-Physical Energy Systems (CPES). The reliance on Information and Communication Technologies (ICT) is critical for coordinating control and ensuring reliable information exchange. Disruptions in the ICT system can degrade overall system performance. Given the complexity of these systems, systematic testing and accurate simulation of MAS behavior under the influence of communication networks are essential to ensure stability and security of supply. The paper provides a structured perspective on how to analyze MAS performance under different communication conditions in CPES, offering recommendations based on the literature. It serves as a guide to understanding the challenges posed by the integration of ICT into power systems, with guidelines that can be used and extended to evaluate and improve system performance.
Download

Paper Nr: 346
Title:

PIMAEX: Multi-Agent Exploration Through Peer Incentivization

Authors:

Michael Kölle, Johannes Tochtermann, Julian Schönberger, Gerhard Stenzel, Philipp Altmann and Claudia Linnhoff-Popien

Abstract: While exploration in single-agent reinforcement learning has been studied extensively in recent years, considerably less work has focused on its counterpart in multi-agent reinforcement learning. To address this issue, this work proposes a peer-incentivized reward function inspired by previous research on intrinsic curiosity and influence-based rewards. The PIMAEX reward, short for Peer-Incentivized Multi-Agent Exploration, aims to improve exploration in the multi-agent setting by encouraging agents to exert influence over each other to increase the likelihood of encountering novel states. We evaluate the PIMAEX reward in conjunction with PIMAEX-Communication, a multi-agent training algorithm that employs a communication channel for agents to influence one another. The evaluation is conducted in the Consume/Explore environment, a partially observable environment with deceptive rewards, specifically designed to challenge the exploration vs. exploitation dilemma and the credit-assignment problem. The results empirically demonstrate that agents using the PIMAEX reward with PIMAEX-Communication outperform those that do not.
Download

Paper Nr: 353
Title:

Bridging the Semantic Gap in vGOAL for Verifiable Autonomous Decision-Making

Authors:

Yi Yang and Tom Holvoet

Abstract: Verifiable autonomous decision-making requires bridging the semantic gap between the execution semantics of an agent programming language (APL) and the formal model used for verification. In this paper, we address this challenge for vGOAL, an APL derived from GOAL and designed for automated verification. We make three contributions. First, we identify the semantic gap in vGOAL: while both its interpreter and its model-checking framework implement the semantics of vGOAL, they differ in how they define the next program state for vGOAL. Second, we bridge the semantic gap by developing an improved interpreter for vGOAL that aligns with the model checker’s formal semantics, thus ensuring correct verification results. Third, we introduce a stepwise refinement approach to address potential efficiency concerns arising from this semantic alignment. Through a case study in autonomous logistics, we demonstrate that while our approach introduces additional verification overhead, the efficient model-checking framework of vGOAL keeps this overhead manageable, making our solution practical for real-world applications.
Download

Paper Nr: 403
Title:

Simultaneous Simulated Annealing-Based Crossover Within a Multi-Agent Model for Solving the Green Share-a-Ride Problem

Authors:

Elhem Elkout, Houssem Eddine Nouri and Olfa Belkahla Driss

Abstract: This research addresses the Green share-a-ride problem (Green-SARP), which extends the share-a-ride problem (SARP) by considering a limited driving range of vehicles in combination with limited refueling infrastructure. The goal of Green-SARP is to remove the possibility of a vehicle running out of fuel during a route by allowing refueling at any alternative fuel station. In this work, we present a new simultaneous Simulated Annealing-based Crossover within a Multi-Agent model (SAC-MA) to solve Green-SARP. In addition to the neighbor operators, a crossover operator is integrated to diversify the search, making it possible to explore new areas of the search space. Experimental studies on newly generated data instances are carried out to evaluate the performance of our approach and show its efficiency compared to a Simulated Annealing algorithm (SA).
Download

Paper Nr: 416
Title:

Towards Holistic Approach to Robust Execution of MAPF Plans

Authors:

David Zahrádka, Denisa Mužíková, Miroslav Kulich, Jiří Švancara and Roman Barták

Abstract: Multi-agent path finding (MAPF) deals with the problem of navigating a set of agents in a shared environment to reach their destinations without collisions. Even if the plan is collision-free, some delay during plan execution may lead to collision of agents if they execute the plans blindly. In this position paper, we discuss the concept of robust execution of MAPF plans by exploiting an action dependency graph. We suggest how to evaluate the effect of delay by computing a slack-like value for not yet visited locations, and we propose a three-layer architecture (retiming, rescheduling, and replanning) to handle delays effectively.
Download

Paper Nr: 438
Title:

Population Protocols for Adaptive Event Dissemination with Autonomous Agents in Vehicular Networks

Authors:

Vincenzo Agate, Farwa Batool, Antonio Bordonaro, Alessandra De Paola, Pierluca Ferraro, Giuseppe Lo Re, Marco Morana and Antonio Virga

Abstract: Recent advances in distributed vehicle-to-vehicle communication promise to transform the user’s driving experience, providing new services capable of improving safety, efficiency and quality of travelling. Due to the large amount of information exchanged, a major challenge of Vehicular Networks is the adoption of appropriate data dissemination protocols that ensure good performance in real-time event detection, while guaranteeing low communication overhead. To this aim, this paper proposes an adaptive event dissemination algorithm which exploits Population Protocols (PPs) for modelling vehicle interactions as coordinated behaviors of autonomous agents in a distributed system. The experimental evaluation performed on realistic vehicle tracks over real-world maps demonstrates the system’s ability to efficiently disseminate information in the network in order to support reliable and distributed event detection services.
Download

Paper Nr: 459
Title:

Multi-Agent System for AI-Assisted Extraction of Narrative Arcs in TV Series

Authors:

Roberto Balestri and Guglielmo Pescatore

Abstract: Serialized TV shows are built on complex storylines that can be hard to track and evolve in ways that defy straightforward analysis. This paper introduces a multi-agent system designed to extract and analyze these narrative arcs. Tested on the first season of Grey’s Anatomy (ABC 2005-), the system identifies three types of arcs: Anthology (self-contained), Soap (relationship-focused), and Genre-Specific (strictly related to the series’ genre). Episodic progressions of these arcs are stored in both relational and semantic (vectorial) databases, enabling structured analysis and comparison. To bridge the gap between automation and critical interpretation, the system is paired with a graphical interface that allows for human refinement using tools to enhance and visualize the data. The system performed strongly in identifying Anthology Arcs and character entities, but its reliance on textual paratexts (such as episode summaries) revealed limitations in recognizing overlapping arcs and subtler dynamics. This approach highlights the potential of combining computational and human expertise in narrative analysis. Beyond television, it offers promise for serialized written formats, where the narrative resides entirely in the text. Future work will explore the integration of multimodal inputs, such as dialogue and visuals, and expand testing across a wider range of genres to refine the system further.
Download

Paper Nr: 460
Title:

The Role of Architectures and Information Availability for Anomaly Detection in Cyber-Physical Energy Systems

Authors:

Emilie Frost and Astrid Nieße

Abstract: In the context of Cyber-Physical Energy Systems (CPES), anomaly detection is an important requirement for dealing with the increasing threats of cyber-attacks. However, privacy or regulatory restrictions might limit the access to information for performing anomaly detection. This paper discusses different possible architectures for the development of an anomaly detection observer in CPES, in the context of information availability. Using an agent-based control use case, these architectures, information availability and their impact on anomaly detection performance are evaluated. The results shed light on the design of appropriate architectures for anomaly detection in CPES with the aim of improving the overall robustness of the system.
Download

Paper Nr: 473
Title:

AgentFlow: A Context Aware Multi-Agent Framework for Dynamic Agent Collaboration

Authors:

Gayathri Nettem, M. Disha, Aavish Gilbert J., Skanda Shreesha Prasad and S. Natarajan

Abstract: Multi-agent systems have long been recognized for their potential in solving complex problems. This paper presents a new framework that focuses on context-awareness and dynamic adaptability to tasks. Unlike traditional agentic approaches, our method involves multiple specialized agents working together, each guided by a set of strategies. These agents dynamically switch roles, utilize workflows to organize task progression, and leverage the perception loop to ensure context-informed decisions and seamless collaboration. The result is a system that consistently produces more accurate, coherent, and creative responses across a variety of domains. Empirical evaluations using benchmarks like HumanEval and MMLU show substantial improvements over single-agent and multi-agent systems.
Download

Paper Nr: 475
Title:

Consent Understanding and Verification for Personalized Assistive Systems

Authors:

Ismael Jaggi, Rachele Carli, Berk Buzcu, Michael Schumacher and Davide Calvaresi

Abstract: The rapid adoption of personalized systems, driven by advancements in natural language processing, sensor technologies, and AI, has transformed the role of virtual personal assistants (VPAs), particularly in healthcare. While VPAs promise to enhance patient experiences through tailored support and adaptive workflows, their complexity often results in opaque functionalities hindering user understanding. This lack of transparency poses significant challenges, particularly in the context of informed consent, where users must comprehend the implications of sharing sensitive personal data. Existing consent systems often rely on static declarations and extensive documentation, which overwhelm users and fail to ensure informed decision-making. To address this problem, this paper presents a novel consent management approach integrated into EREBOTSv3.0, an agent-based GDPR-compliant explainable framework for virtual assistants. The proposed solution introduces (i) an interactive method that structures consent into clear sections with summaries and examples to improve user comprehension and (ii) a question-based verification mechanism that assesses understanding and reinforces knowledge when needed. By leveraging EREBOTS’ modular architecture, real-time feedback, and secure data management, the proposed approach enhances transparency, fosters trust, and simplifies consent understanding for dialog-based healthcare systems. This work lays the foundation for addressing critical challenges at the intersection of personalized AI, healthcare, and data protection.
Download

Paper Nr: 68
Title:

Cooperative Evacuation Guidance Methods in Large-Scale Disaster Situations Based on Wi-Fi Sensing Data

Authors:

Atsuo Ozaki

Abstract: This study introduces a Wi-Fi packet sensor developed to acquire headcount distribution data, which are obtained by deploying several of these sensors at a large-scale event. It also describes the results of evaluating, through multi-agent simulation, the proposed distributed and coordinated evacuation guidance method for disaster situations. The results confirm that by balancing the guide loads, it is possible to evacuate all evacuees from the venue in a shorter time than before the addition of the load balance. Furthermore, it was confirmed that if the evacuation route indicated by the guide did not significantly change the congestion situation, it was important for evacuees to choose another exit route at their discretion.
Download

Paper Nr: 71
Title:

Addressing the Ethical Implications of AI Models Developed: A Case Study of Master's Degree Dissertations in Data Science for Industry and Society

Authors:

Alina Delia Călin

Abstract: The increase in the development and use of AI models has generated many ethical and societal concerns. In this paper, we examine the ethical element in several dissertations presented in July and September 2024 by students enrolled in the Data Science for Industry and Society Master’s Degree Programme. We assess the level of awareness of ethical principles by analysing in these case studies the ethical concerns addressed by most students, the ethical principles that are mostly neglected, and possible implications for society. The findings reveal that data bias is the most addressed concern, while accountability is the most neglected ethical principle. Some recommendations for possible improvements include the use of ethical AI tools for the design and assessment of AI models and applications.
Download

Paper Nr: 104
Title:

RoDiL: Giving Route Directions with Landmarks by Robots

Authors:

Kanta Tachikawa, Shota Akahori, Kohei Okuoka, Mitsuhiko Kimoto and Michita Imai

Abstract: For social robots, a critical aspect is the design of mechanisms for providing information that is understandable to a human recipient. In tasks such as giving route directions, robots must explain the route clearly to ensure that the user can reach the destination. However, most studies on guiding robots have assumed that the robot will only present a route from its current location without considering the route’s complexity. In this study, we propose a robot guiding system, RoDiL (Route Directions with Landmarks), that aims to guide users along a simple route by leveraging their knowledge of a city, especially when the route from the current location to the destination is complex. Specifically, within the context of user interaction, RoDiL comprehends which landmarks are familiar to the user. Subsequently, RoDiL initiates giving route directions using the landmark familiar to the user as the starting point. We conducted an experimental comparison between landmark-based guidance and non-landmark-based guidance with 100 participants. Landmark-based guidance was evaluated significantly more highly when the direct route from the current location was complex. In contrast, when the route from the current location was simple, non-landmark-based guidance was preferred. These results confirm the efficacy of the RoDiL design criteria.
Download

Paper Nr: 123
Title:

Efficient Selection of Consistent Plans Using Patterns and Constraint Satisfaction for Beliefs-Desires-Intentions Agents

Authors:

Veronika Kurchyna, Ye Eun Bae, Jan Ole Berndt and Ingo J. Timm

Abstract: Agent-based models can portray complex systems that emerge from the actions of individual actors. The use of many agents with complex decision-making processes in a large action space is computationally intensive and leads to slow simulations. This work proposes an alternative approach to agent deliberation by pre-computing valid action sequences and simplifying decision-making at runtime to a linear problem. The method is demonstrated with pandemic self-protection as a use case for an implementation of the concept. Additionally, a step-by-step guideline for application of this approach is provided.
Download

Paper Nr: 126
Title:

AGENTFORGE: A Flexible Low-Code Platform for Reinforcement Learning Agent Design

Authors:

Francisco E. Fernandes Jr. and Antti Oulasvirta

Abstract: Developing a reinforcement learning (RL) agent often involves identifying values for numerous parameters, covering the policy, reward function, environment, and agent-internal architecture. Since these parameters are interrelated in complex ways, optimizing them is a black-box problem that proves especially challenging for nonexperts. Although existing optimization-as-a-service platforms (e.g., Vizier and Optuna) can handle such problems, they are impractical for RL systems, since the need for manual user mapping of each parameter to distinct components makes the effort cumbersome. It also requires understanding of the optimization process, limiting the systems’ application beyond the machine learning field and restricting access in areas such as cognitive science, which models human decision-making. To tackle these challenges, the paper presents AGENTFORGE, a flexible low-code platform to optimize any parameter set across an RL system. Available at https://github.com/feferna/AgentForge, it allows an optimization problem to be defined in a few lines of code and handed to any of the interfaced optimizers. With AGENTFORGE, the user can optimize the parameters either individually or jointly. The paper presents an evaluation of its performance for a challenging vision-based RL problem.
Download

Paper Nr: 136
Title:

Integration of Emotionally Intelligent Artificial Intelligence into Neuromarketing: Attitudes, Opportunities, Challenges

Authors:

Ana Todorova and Irina Kostadinova

Abstract: The integration of emotionally intelligent artificial intelligence into neuromarketing promises to revolutionize the way organizations interact with consumers. This research, based on a survey of 510 marketers and 708 consumers, reveals the complex picture of perceptions and expectations regarding this new technology. Although the potential benefits are significant, ethical dilemmas present a major challenge. The report analyzes the current state of the art of the latest developments in marketing and neuromarketing and thereby contributes to the development of marketing knowledge. At the same time, the authors offer guidelines for developing ethical frameworks for the use of emotionally intelligent artificial intelligence.
Download

Paper Nr: 153
Title:

Agile Software Management with Cognitive Multi-Agent Systems

Authors:

Konrad Cinkusz and Jarosław A. Chudziak

Abstract: This paper explores the integration of cognitive agents powered by Large Language Models (LLMs) into software project management within the Scaled Agile Framework (SAFe). We introduce the CogniSim framework, an ecosystem where virtual agents operate in a simulated software environment to fulfill key roles in IT project development. Emphasis is placed on the adaptability of these agents to the Scrum methodology, particularly in decision-making and problem-solving. By combining LLMs with Multi-Agent Systems (MAS), we focus on improvements in project management, development processes, and Agile methodologies. Through simulations and case studies, we demonstrate advancements in task delegation, communication, and project lifecycle management, highlighting the potential of LLM-augmented MAS to manage software projects with increased precision and intelligence. Our findings provide insights into essential components for an effective cognitive multi-agent ecosystem, including Dynamic Context techniques and Theory of Mind for enhanced agent collaboration, laying the groundwork for future research in this field.
Download

Paper Nr: 187
Title:

Stateful Monitoring and Responsible Deployment of AI Agents

Authors:

Debmalya Biswas

Abstract: AI agents can be disruptive given their potential to compose existing models and agents. Unfortunately, developing and deploying multi-agent systems at scale remains a challenging problem. In this paper, we specifically focus on the challenges of monitoring stateful agents and deploying them in a responsible fashion. We introduce a reference architecture for AI agent platforms, highlighting the key components to be considered in designing the respective solutions. From an agent monitoring perspective, we show how a snapshot-based algorithm can answer different types of queries related to agent execution state. On the responsible deployment aspect, we show how responsible AI dimensions relevant to AI agents can be integrated in a seamless fashion with the underlying AgentOps pipelines.
Download

Paper Nr: 192
Title:

Integration of Aggregated Information and Subjective Experience Through Sequential Information Presentation

Authors:

Yoshimasa Ohmoto and Hiroki Yamamoto

Abstract: Often, when following a pedestrian navigation system, individuals do not remember the route taken or the buildings passed upon arriving at their destination. We hypothesized that by integrating sparsely aggregated information from the problem space into the context of the user’s subjective experience, the problem space could be more comprehensively understood from the periphery of the subjective experience. In this study, we tested this hypothesis using pedestrian navigation. Specifically, we proposed a method for mapping aggregated information to subjective experience by incorporating landmarks around the user into the route guidance by a guide agent and sequentially presenting information even at non-decisive points that do not prompt a route change. Experimental results indicated significant differences in “information organization” and “understanding of urban space” in questionnaires. Significant differences were also observed in route memory. The results suggest that the proposed method facilitates the integration of aggregated information to subjective experience.
Download

Paper Nr: 228
Title:

Alexa and Copilot: A Tale of Two Assistants

Authors:

Todericiu Ioana Alexandra, Dioşan Laura and Şerban Camelia

Abstract: As virtual assistants (VAs) become essential to contemporary interactions, it is imperative to understand how to evaluate their functionalities. This study offers a comparison framework for assessing the design and execution of Amazon Alexa and Microsoft Copilot Studio, emphasizing their capabilities in question-answering activities. Through the examination of their deterministic and probabilistic approaches, we evaluate response times, precision, flexibility, and linguistic support. We have developed a systematic framework to assess the strengths and shortcomings of each VA, utilizing educational queries as a realistic test case that elucidates the influence of design decisions on performance. Our study lays the groundwork for choosing an appropriate VA according to particular needs, assisting developers and organizations in traversing the varied realm of VA technologies. Regardless of whether precision or adaptability is prioritized, our approach facilitates an educated decision, simplifying the process of aligning the appropriate VA with the corresponding circumstance.
Download

Paper Nr: 273
Title:

Social Laws for Multi-Agent Pathfinding

Authors:

Jan Slezák, Jakub Mestek and Roman Barták

Abstract: Multi-agent pathfinding (MAPF) is the problem of finding collision-free plans for navigating a set of agents from their starting positions to their destinations. Frequently, rigid plans for all agents are centrally searched and perfect execution of actions is assumed, which raises issues with the uncertainty and the dynamicity of the environment. Social laws are a well-established paradigm in multi-agent systems for coordinating agents while avoiding extensive negotiation. In this work, we present a decentralized online algorithm to solve the MAPF problem without the need for communication between agents. This approach uses social laws inspired by traffic rules.

Paper Nr: 282
Title:

Comparative Analysis of Simulated Annealing and Particle Swarm Optimization for Multi-Robot Task Allocation in ROS

Authors:

Dhruv Kumar Sharma, Ujjwal Singh, Snehal Nalawade and Pratik Shah

Abstract: A comparative analysis of two prominent optimization techniques—simulated annealing (SA) and particle swarm optimization (PSO)—is conducted within the framework of multi-robot systems (MRS). The research investigates how each algorithm effectively allocates tasks among multiple robots, focusing on performance metrics, convergence speed, and robustness in dynamic environments. Through extensive simulations in ROS, utilizing a dedicated testbed for real-world scenario emulation, distinct advantages and limitations of both algorithms are revealed across various setups. The testbed integrates realistic garbage generation, dynamic obstacles, and robot interactions, allowing for detailed empirical evaluations. The study highlights the practical implications of using SA and PSO for multi-robot coordination, laying the groundwork for future research on hybrid approaches and algorithmic enhancements in complex robotic applications.
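As a rough illustration of how simulated annealing can be applied to task allocation, the sketch below assigns tasks to robots so as to minimize the largest per-robot workload. It is not the authors' implementation: the makespan cost, the geometric cooling schedule, the single-task reassignment move, and all names (`makespan`, `anneal`, `durations`) are assumptions chosen for illustration, and no ROS integration is shown.

```python
import math
import random


def makespan(assignment, durations, n_robots):
    """Cost of an allocation: the largest total workload of any robot."""
    loads = [0.0] * n_robots
    for task, robot in enumerate(assignment):
        loads[robot] += durations[task]
    return max(loads)


def anneal(durations, n_robots, steps=5000, t_start=10.0, t_end=0.01, seed=0):
    """Simulated annealing over task-to-robot assignments (illustrative only)."""
    rng = random.Random(seed)
    current = [rng.randrange(n_robots) for _ in durations]
    current_cost = makespan(current, durations, n_robots)
    best, best_cost = current[:], current_cost
    for step in range(steps):
        # Geometric cooling from t_start towards t_end (assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / steps)
        # Neighbour move: reassign one randomly chosen task to a random robot.
        candidate = current[:]
        candidate[rng.randrange(len(durations))] = rng.randrange(n_robots)
        cand_cost = makespan(candidate, durations, n_robots)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
    return best, best_cost


if __name__ == "__main__":
    tasks = [4, 2, 7, 3, 5, 1]  # hypothetical task durations
    allocation, cost = anneal(tasks, n_robots=3)
    print(allocation, cost)
```

A particle swarm variant would need a different solution encoding, since PSO updates continuous positions and velocities, whereas this assignment vector is discrete.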

Paper Nr: 309
Title:

Agent-Based Computational Geometry

Authors:

Akbarbek Rakhmatullaev, Shahruz Mannan, Anirudh Potturi and Munehiro Fukuda

Abstract: Cluster computing can increase CPU and spatial scalability of computational geometry. While data-streaming tools such as Apache Sedona (which we simply call Sedona) line up built-in GIS parallelization features, they require a shift to their programming paradigm and thus a steep learning curve. In contrast, agent-based modeling is frequently used in computational geometry, as agent propagation and flocking simulate spatial problems. We aim to identify whether, and in which GIS applications, an agent-based approach demonstrates efficient parallelizability. This paper compares MASS, Sedona, and MPI, which respectively represent the agent-based, data-streaming, and baseline message-passing approaches to parallelizing four GIS programs. Our analysis finds that MASS demonstrates simple programmability and yields competitive parallel performance.

Paper Nr: 318
Title:

Analysis of the Continuous Effects of Assertive Feedback from a Job Interview Training Agent

Authors:

Tomoko Koda, Kota Yamauchi, Nao Takeuchi and Miho Hotta

Abstract: In this study, we developed an interview training agent system that identifies areas for improvement in interviewees’ nonverbal behaviours (eye gaze, facial expression, and posture) and verified its effectiveness in providing feedback using assertive communication in a series of experiments. Assertive communication is a method of conveying one’s opinions and sentiments while respecting another person’s position and opinions. The effectiveness of the feedback was verified in two conditions: the assertive feedback condition, in which the agent provided feedback while expressing its sentiments, in addition to identifying areas for improvement and offering suggestions for improvement; and the control condition, in which the agent solely identified areas for improvement. The preliminary results showed that assertive feedback was effective in improving the acceptability and usefulness of the feedback and the agent’s interpersonal impressions. In addition, as a continuous effect of the three interview practices, the agent’s interpersonal impression improved as the number of times the participants received assertive feedback increased.

Paper Nr: 320
Title:

Towards Ubiquitous Mapping and Localization for Dynamic Indoor Environments

Authors:

Halim Djerroud, Nico Steyn, Olivier Rabreau, Patrick Bonnin and Abderraouf Benali

Abstract: We present UbiSLAM, an innovative solution for real-time mapping and localization in dynamic indoor environments. By deploying a network of fixed RGB-D cameras strategically throughout the workspace, UbiSLAM addresses limitations commonly encountered in traditional SLAM systems, such as sensitivity to environmental changes and reliance on mobile unit sensors. This fixed-sensor approach enables real-time, comprehensive mapping, enhancing the localization accuracy and responsiveness of robots operating within the environment. The centralized map generated by UbiSLAM is continuously updated, providing robots with an accurate global view, which improves navigation, minimizes collisions, and facilitates smoother human-robot interactions in shared spaces. Beyond its advantages, UbiSLAM faces challenges, particularly in ensuring complete spatial coverage and managing blind spots, which necessitate data integration from the robots themselves. In this paper, we discuss potential solutions, such as automatic calibration for optimal camera placement and orientation, along with enhanced communication protocols for real-time data sharing. The proposed model reduces the computational load on individual robotic units, allowing less complex robotic platforms to operate effectively while enhancing the robustness of the overall system.

Paper Nr: 331
Title:

Conceptual Approaches to Identify the Hazardous Scenarios in Safety Analysis for Automated Driving Systems

Authors:

Marzana Khatun, Florence Wagner, Rolf Jung and Michael Glass

Abstract: Ensuring the safety of road users is one of the major challenges in highly automated driving. The technologies applied in semi- or fully automated vehicles that are safer than human drivers compromise functionalities and human comfort. A comprehensive understanding of the use of complex driving systems and the Operational Design Domain (ODD) is essential for the effective deployment and safe operation of Automated Driving Systems (ADSs). Hazard analysis is a foundation of various safety engineering methods, which include Functional Safety (FuSa) and Safety Of The Intended Functionality (SOTIF). Scenario-based analysis offers significant advantages in the safety analysis of automated vehicles but poses inherent difficulties in identifying unknown hazardous scenarios. The work presented in this paper deals with conceptual approaches to hazardous scenario identification. Moreover, it discusses the incorporation of Machine Learning (ML) in Hazard Analysis and Risk Assessment (HARA) for vehicles equipped with ADSs. Furthermore, this paper can serve as foundational support for research inquiries related to ADS validation and safety assessment.

Paper Nr: 335
Title:

Efficient Models Deep Reinforcement Learning for NetHack Strategies

Authors:

Yasuhiro Onuki, Yasuyuki Tahara, Akihiko Ohsuga and Yuichi Sei

Abstract: Deep reinforcement learning (DRL) has been widely used in agent research across various video games, demonstrating its effectiveness. Recently, there has been increasing interest in DRL research in complex environments such as Roguelike games. These games, while complex, offer fast execution speeds, making them useful as testbeds for DRL agents. Among them, the game NetHack has gained research attention. In this study, we aim to train a DRL agent for efficient learning with reduced training costs using the NetHack Learning Environment (NLE). We propose a method that incorporates a variational autoencoder (VAE). Additionally, since the rewards provided by the NLE are sparse, which complicates training, we also trained a DRL agent with additional rewards. Although we expected that using the VAE would allow for more advantageous progress in the game, it proved ineffective. Conversely, we found that the additional rewards are effective.

Paper Nr: 344
Title:

DeepGen: A Deep Reinforcement Learning and Genetic Algorithm-Based Approach for Coverage in Unknown Environment

Authors:

Nirali Sanghvi, Rajdeep Niyogi and Ribhu Mondal

Abstract: In this paper, a novel approach to optimize waypoint placement and coverage in multi-agent systems in unknown environments using a combined Genetic Algorithm and Deep Reinforcement Learning has been proposed. Effective exploration and coverage are essential in various fields, such as surveillance, environmental monitoring, and precision agriculture, where agents must cover large and often unknown environments efficiently. The proposed method uses a Genetic Algorithm to identify optimal waypoint configurations that maximize coverage while minimizing overlap among waypoints, after which a deep reinforcement learning policy refines the agents’ coverage policy to adaptively navigate and explore new areas. Simulation results demonstrate that this GA-DDQN approach significantly improves both the effectiveness of coverage and computational efficiency compared to traditional single-strategy methods. This combined framework offers a robust solution for real-world applications requiring optimized, adaptive multi-agent exploration and coverage.

Paper Nr: 402
Title:

Towards Developing Ethical Reasoners: Integrating Probabilistic Reasoning and Decision-Making for Complex AI Systems

Authors:

Nijesh Upreti, Jessica Ciupa and Vaishak Belle

Abstract: A computational ethics framework is essential for AI and autonomous systems operating in complex, real-world environments. Existing approaches often lack the adaptability needed to integrate ethical principles into dynamic and ambiguous contexts, limiting their effectiveness across diverse scenarios. To address these challenges, we outline the necessary ingredients for building a holistic, meta-level framework that combines intermediate representations, probabilistic reasoning, and knowledge representation. The specifications therein emphasize scalability, supporting ethical reasoning at both individual decision-making levels and within the collective dynamics of multi-agent systems. By integrating theoretical principles with contextual factors, it facilitates structured and context-aware decision-making, ensuring alignment with overarching ethical standards. We further explore proposed theorems outlining how ethical reasoners should operate, offering a foundation for practical implementation. These constructs aim to support the development of robust and ethically reliable AI systems capable of navigating the complexities of real-world moral decision-making scenarios.

Paper Nr: 406
Title:

Enhancing Many-Objective Particle Swarm Optimization with Island Model for Agricultural Optimization

Authors:

Chnini Samia, Abadlia Houda, Smairi Nadia and Nasri Nejah

Abstract: With the growing complexity of agricultural systems and the need to optimize multiple conflicting objectives simultaneously, traditional optimization methods often struggle to find satisfactory solutions. In this work, we introduce a novel enhancement to the standard Multi-Objective Particle Swarm Optimization (MOPSO) algorithm that significantly improves its effectiveness in handling the diverse and dynamic objectives inherent in agricultural optimization problems. We propose an improvement to the MOPSO algorithm by introducing an islanding technique to promote exploration and exploitation of the many-objective search space. The improved MOPSO algorithm, called I-MOPSO, guides the search towards optimal and diverse solutions by dividing the search space into islands and facilitating information exchange between them. We put I-MOPSO into practice and tested it using a series of common many-objective optimization algorithms. Experimental results show that I-MOPSO is capable of finding high-quality solutions on a variety of test problems, often outperforming the standard MOPSO algorithm and NSGA-III.

Paper Nr: 415
Title:

Strategy-Proofness and Non-Obvious Manipulability of Top-Trading-Cycles with Strategic Invitations

Authors:

Shinnosuke Hamasaki, Taiki Todo and Makoto Yokoo

Abstract: Diffusion mechanism design is one of the recent trends in the mechanism design literature. Its purpose is to incentivize agents to diffuse the information about the mechanism to as many followers as possible, as well as to report their preferences. This paper is the first attempt to consider diffusion mechanism design for two-sided matching from the perspective of non-obvious manipulability. We focus on the top-trading-cycles (TTC) mechanism for the many-to-one two-sided matching problem. We clarify the necessary and sufficient conditions for the mechanism to satisfy strategy-proofness and non-obvious manipulability, respectively. We also propose a new TTC-based matching mechanism that violates strategy-proofness but satisfies non-obvious manipulability, which illustrates how we can handle strategic information diffusion in two-sided matching.

Paper Nr: 418
Title:

Action-Based Intrinsic Reward Design for Cooperative Behavior Acquisition in Multi-Agent Reinforcement Learning

Authors:

Iori Takeuchi and Keiki Takadama

Abstract: In recent years, research has been conducted in multi-agent reinforcement learning that aims at efficient agent exploration in complex environments by using intrinsic rewards. However, such intrinsic rewards may inhibit the learning of behaviors necessary for acquiring cooperative behavior, and the agents may then be unable to solve the task of the environment. In this paper, we propose two types of intrinsic reward designs to promote agents’ learning of cooperative behaviors in multi-agent reinforcement learning. One is to use the average of the values of the actions selected by all agents to promote the learning of actions that are necessary for cooperative behavior but whose value is difficult to increase. The other is to provide an individual intrinsic reward when the value of the action selected by each agent is lower than the average of the values of all the actions at the time, aiming to escape from the local solution. The results of the experiment with StarCraft II scenario 6h vs 8z showed that by adding the proposed intrinsic reward to the intrinsic reward that encourages agents to explore unexplored areas, cooperative behavior can be obtained in more cases than before.
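The two reward designs described in this abstract can be sketched roughly as follows. This is one possible reading and not the authors' formulation: it assumes per-agent action values are available as a table `q_values[agent][action]`, that "all the actions" means all actions available to that agent, and that the individual reward is a fixed `bonus`. The function names are hypothetical.

```python
def team_reward(q_values, actions):
    """Shared reward: mean value of the actions selected by all agents."""
    chosen = [q_values[i][a] for i, a in enumerate(actions)]
    return sum(chosen) / len(chosen)


def individual_rewards(q_values, actions, bonus=1.0):
    """Per-agent reward: a bonus when the selected action's value is below
    the mean value of that agent's actions (assumed interpretation)."""
    rewards = []
    for i, a in enumerate(actions):
        mean_q = sum(q_values[i]) / len(q_values[i])
        rewards.append(bonus if q_values[i][a] < mean_q else 0.0)
    return rewards


if __name__ == "__main__":
    # Hypothetical action values for three agents with two actions each.
    q = [[1.0, 3.0], [2.0, 2.0], [0.0, 4.0]]
    selected = [0, 1, 1]
    print(team_reward(q, selected))
    print(individual_rewards(q, selected))
```

In this reading, only an agent that picked a below-average action receives the individual bonus, which is how the design nudges agents away from a local solution.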

Paper Nr: 445
Title:

VITAMIN: A Compositional Framework for Model Checking of Multi-Agent Systems

Authors:

Angelo Ferrando and Vadim Malvone

Abstract: The verification of Multi-Agent Systems (MAS) poses a significant challenge. Various approaches and methodologies exist to address this challenge; however, tools that support them are not always readily available. Even when such tools are accessible, they tend to be hard-coded, lacking in compositionality, and challenging to use due to a steep learning curve. In this paper, we introduce a methodology designed for the formal verification of MAS in a modular and versatile manner, along with an initial prototype, which we named VITAMIN. Unlike existing verification methodologies and frameworks for MAS, VITAMIN is constructed for easy extension to accommodate various logics (for specifying the properties to verify) and models (for determining what such properties are verified on).

Paper Nr: 455
Title:

Prediction-Based Selective Negotiation for Refining Multi-Agent Resource Allocation

Authors:

Madalina Croitoru, Cornelius Croitoru and Gowrishankar Ganesh

Abstract: This paper proposes a two-stage framework for multi-agent resource allocation. Following a Borda-based allocation, machine learning predictions about agent preferences are used to select the agent pairs that negotiate to swap resources. We show that this selective negotiation improves overall satisfaction with the resource redistribution.

Paper Nr: 464
Title:

Integrating Late Variable Binding with SP-MCTS for Efficient Plan Execution in BDI Agents

Authors:

Frantisek Vidensky, Frantisek Zboril and Petr Veigend

Abstract: This paper investigates the Late binding strategy as an enhancement to the SP-MCTS algorithm for intention selection and variable binding in BDI (Belief-Desire-Intention) agents. Unlike the Early binding strategy, which selects variable substitutions prematurely, Late binding defers these decisions until necessary, aggregating all substitutions for a plan into a single node. This approach reduces the search tree size and enhances adaptability in dynamic environments by maintaining flexibility during plan execution. We implemented the Late binding strategy within the FRAg system to validate our approach and conducted experiments in a static maze task environment. Experimental results demonstrate that the Late binding strategy consistently outperforms Early binding, achieving up to 150% higher rewards, particularly for the lowest parameter values of the SP-MCTS algorithm in resource-constrained scenarios. These results confirm that it is feasible to integrate Late binding into intention selection methods, opening opportunities to explore its use in approaches with lower computational demands than the SP-MCTS algorithm.

Paper Nr: 498
Title:

Interview Bot: Can Agentic LLM’s Perform Ethnographic Interviews?

Authors:

Stine Lyngsø Beltoft, Peter Schneider-Kamp and Søren Tollestrup Askegaard

Abstract: Chatbots based on large language models present a scalable and consistent alternative to human interviewers for collecting qualitative data. In this paper, we introduce the agentic chatbot “Interview Bot”, designed to mimic human adaptability and empathy in an interview setting. We explore to what extent it can handle the nuances and open-ended nature of ethnographic interviews. Our findings indicate that chatbots can engage participants and collect meaningful data, but that they still sometimes fall short of fully replicating human-facilitated interviews. Notwithstanding challenges with the current state of the art, in the medium term, LLM-based agents hold great potential for scaling qualitative research beyond the confines of geographical, cultural, and language boundaries.