Accepted Papers – IJCAI 2026

IJCAI-ECAI 2026 Accepted Papers · Survey Track

Presentation format

Every accepted paper is presented in two formats: an oral talk (6 min talk + 2 min for Q&A), which must be delivered in person in Bremen by one of the authors, and a poster (A0, free format) during a dedicated poster session.

#SV1

Session Aug 20 · 11:30–12:30 · Room 4

Poster Aug 20 · 16:30–18:00

A Survey on 3D Skeleton Based Person Re-Identification: Taxonomy, Advances, Challenges, and Interdisciplinary Prospects

Haocong Rao, Chunyan Miao

Person re-identification via 3D skeletons is an important emerging research area that attracts increasing attention within the pattern recognition community. With distinctive advantages across various application scenarios, numerous 3D skeleton based person re-identification (SRID) methods with diverse skeleton modeling and learning paradigms have been proposed in recent years. In this paper, we provide a comprehensive review and analysis of recent SRID advances. First of all, we define the SRID task and provide an overview of its origin and major advancements. Secondly, we formulate a systematic taxonomy that organizes existing methods into three categories centered on hand-crafted, sequence-based, and graph-based modeling. Then, we elaborate on the representative models along these three types with an illustration of foundational mechanisms. Meanwhile, we provide an overview of mainstream supervised, self-supervised, and unsupervised SRID learning paradigms and corresponding common methods. A thorough evaluation of state-of-the-art SRID methods is further conducted over various types of benchmarks and protocols to compare their effectiveness, efficiency, and key properties. Finally, we present the key challenges and prospects to advance future research, and highlight interdisciplinary applications of SRID with a case study. A curated collection of valuable resources is available at https://github.com/Kali-Hac/3D-SRID-Survey.

Computer Vision: Other · Computer Vision: Recognition (object detection, categorization) · Multidisciplinary Topics and Applications: Other · Multidisciplinary Topics and Applications: Security and privacy

#SV5

Session Aug 20 · 15:00–16:30 · Room 7

Poster Aug 20 · 16:30–18:00

A Review on Test-Time Scaling for Agentic Large Language Models

Jiayu An, Zheng Chen, Yongcheng Jing, Dacheng Tao, Bo Li

The field of Large Language Models (LLMs) is moving beyond simply scaling the parameters during pre-training. Instead, researchers are focusing on Test-Time Scaling (TTS), which is a mechanism to allocate computational resources dynamically during the inference process. In the rapidly emerging field of Agentic LLMs, TTS has also shown great value in enabling agents to handle complex tasks that static models cannot handle. Despite all the substantial work in this field, it still lacks a comprehensive survey summarizing these advances. To the best of our knowledge, this paper presents the first systematic review of TTS tailored for Agentic LLMs. To this end, we propose a novel RAIE taxonomy along four scaling dimensions: (i) Reasoning optimizes the entire thought process through search algorithms and self-verification; (ii) Acting involves collecting information through external tools and environmental feedback; (iii) Interaction copes with complex tasks by forming various multi-agent networks; (iv) Evolution enables agents to continuously update their memory and strategies. Furthermore, we also introduce a task-oriented guideline for choosing the best TTS strategy. Finally, we sum up challenges and future directions to inspire further studies in this field. A curated list of papers and associated code will be made publicly available.

Agent-based and Multi-agent Systems: Engineering methods, platforms, languages and tools

#SV17

Session Aug 19 · 11:30–12:30 · Room 3

Poster Aug 19 · 16:30–18:00

Graph Rewiring in GNNs to Mitigate Over-Squashing and Over-Smoothing: A Survey

Hugo Attali, Nathalie Pernelle, Davide Buscaldi, Fragkiskos D. Malliaros

Graph Neural Networks are powerful models for learning from graph-structured data, yet their effectiveness is often limited by two critical challenges: over-squashing, where information from distant nodes is excessively compressed, and over-smoothing, where repeated propagation makes node representations indistinguishable. Both phenomena stem from the interaction between message passing and the input topology, ultimately degrading information flow and limiting the performance of GNNs. In this survey, we examine graph rewiring techniques, a class of methods designed to modify the graph topology to enhance information propagation in GNNs. We provide a comprehensive review of state-of-the-art rewiring approaches, delving into their theoretical underpinnings, practical implementations, and performance trade-offs.

Data Mining: Mining graphs · Data Mining: Networks · Machine Learning: Representation learning

#SV20

Session Aug 18 · 15:00–16:30 · Room 5

Poster Aug 18 · 16:30–18:00

When Vision Meets Graphs: A Survey on Graph Reasoning and Learning

Xinjian Zhao, Wei Pang, Zhixuan Yu, Xiangru Jian, Xiaozhuang Song, Yaoyao Xu, Zhongkai Xue, Dingshuo Chen, Shu Wu, Philip Torr, Tianshu Yu

Graphs are a fundamental data structure underlying many problems in the natural and social sciences. Over the past decade, Graph Neural Networks (GNNs) have dominated graph machine learning, supported by solid theoretical foundations. Yet scientists often understand graph structure through vision: chemists read molecular diagrams and social scientists inspect network visualizations. Despite decades of work on graph visualization, most graph learning pipelines still treat graphs purely as symbolic structures, rarely leveraging the visual form of graphs. We argue that this gap deserves renewed attention in the era of powerful vision and vision language models. This survey provides a first systematic overview of the emerging area we term vision meets graphs, which treats visual depictions of graphs as first-class inputs for reasoning and learning. We organize existing work into three threads. Vision for Graph Reasoning studies how models can use visual depictions of graphs to understand structure and carry out multi-step reasoning. Vision for Graph Learning explores how visual features can complement or augment graph encoders beyond known limitations of message passing. Scientific Graphs examines domains where standardized depiction conventions support both reasoning and learning. Our goal is to clarify what current methods can and cannot do, and to outline a path toward foundation models that perceive and reason about graphs as scientists do.

Computer Vision: Multimodal learning · Computer Vision: Vision, language and reasoning · Data Mining: Data visualization · Data Mining: Networks

#SV22

Session Aug 19 · 10:00–11:00 · Room 4

Poster Aug 19 · 16:30–18:00

A Survey of Artificial Intelligence in Endoscopic Surgery Workflow: From Perception to Surgical Support

Juyan Ba, Hao Chen, Xiaohan Xing, Yi Wang

Endoscopic surgery demands continuous real-time visual decision-making under severe constraints, including a limited field of view, motion blur, and dynamically deforming anatomy. These factors impose substantial cognitive load on surgeons and motivate the integration of artificial intelligence (AI) throughout the endoscopic surgical workflow. This survey reviews recent progress in AI for endoscopic surgery and organizes the literature into four stages that span perception to action: (1) image enhancement and analysis methods that improve visual perception; (2) multimodal video understanding approaches that model and reason about surgical instruments and anatomical structures over space and time; (3) 3D reconstruction techniques that enable robust tracking and interpretation of deformable anatomy; and (4) emerging paradigms of embodied surgical intelligence, where action-conditioned world models link perception to intraoperative assistance. Across these stages, we summarize current capabilities and limitations and identify key open challenges for clinical deployment. In addition, we provide an overview of 18 publicly available datasets, highlighting their scope and annotations. We hope this survey will stimulate further research toward reliable and clinically deployable AI systems for endoscopic surgery.

SV: Computer Vision · SV: Multidisciplinary Topics and Applications

#SV28

Session Aug 18 · 11:30–12:30 · Room 10

Poster Aug 18 · 16:30–18:00

Towards Automated Kernel Generation in the Era of LLMs

Yang Yu, Peiyu Zang, Chi Hsu Tsai, Haiming Wu, Yixin Shen, Jialing Zhang, Haoyu Wang, Zhiyou Xiao, Jingze Shi, Yuyu Luo, Wentao Zhang, Chunlei Men, Guang Liu, Yonghua Lin

The performance of modern AI systems is fundamentally constrained by the quality of their underlying GPU kernels, which translate high-level algorithmic semantics into low-level hardware operations. Achieving near-optimal kernels requires expert-level understanding of hardware architectures and programming models, making kernel engineering a critical but notoriously time-consuming and non-scalable process. Recent advances in large language models and LLM-based agents have opened new possibilities for automating kernel generation and optimization. LLMs are well-suited to compress expert-level kernel knowledge that is difficult to formalize, while agentic systems further enable scalable optimization by casting kernel development as an iterative, feedback-driven loop. Rapid progress has been made in this area. However, the field remains fragmented and lacks a systematic perspective for LLM-driven kernel generation. This survey addresses this gap by providing a structured overview of existing approaches, spanning LLM-based approaches and agentic optimization workflows, and systematically organizing the datasets and benchmarks that underpin learning and evaluation in this domain. Moreover, key open challenges and future research directions are further outlined, aiming to establish a comprehensive reference for the next generation of automated kernel optimization. To keep track of this field, we maintain an open-source GitHub repository at https://github.com/flagos-ai/awesome-LLM-driven-kernel-generation.

Multidisciplinary Topics and Applications: AI hardware · Multidisciplinary Topics and Applications: Software engineering · Natural Language Processing: Applications · Agent-based and Multi-agent Systems: Applications

#SV33

Session Aug 19 · 15:00–16:30 · Room 10

Poster Aug 19 · 16:30–18:00

Towards Vision-Spatiotemporal Fusion in Traffic Forecasting: A Survey on Cross-Modal Alignment

Anna Wang, Chao Zhang, Mingwei Lin, Junbo Zhang, Zeshui Xu, Wentao Li, Pengfei Zhang, Oscar Castillo

Traffic forecasting is evolving, with world models emerging as a powerful framework applicable to tasks such as core state, trajectory, event, and demand forecasting. These tasks involve both visual and spatiotemporal data, yet most existing methods treat them separately, hindering a unified understanding of traffic scenes in both semantic meanings and spatiotemporal dynamics. The fusion of the two modalities is critical for building models that comprehend complex traffic scenarios. However, the fusion issue faces two fundamental misalignments: semantic, where pixels conflict with traffic concepts, and geometric, which requires spatial intelligence to map 2D inputs into 3D. This survey reframes vision-spatiotemporal fusion via the unique lens of cross-modal alignment, addressing semantic and geometric failures that limit forecasting reliability. First, we categorize existing methods into three paradigms: feature-level, semantic-level, and task-level. This reveals their progression from low-level feature manipulation to high-level architectural integration. Second, we synthesize representative techniques per paradigm, highlighting geometric challenges such as cross-view association and spatial mapping. Third, we examine current datasets and benchmarks, highlighting their deficiencies in evaluating alignment. Finally, we outline future directions, including spatiotemporal intelligence for robust perception and holistic traffic world models. The unified framework establishes a reference for robust and explainable forecasting systems.

Data Mining: Mining spatial and/or temporal data

#SV36

Session Aug 20 · 11:30–12:30 · Room 7

Poster Aug 20 · 16:30–18:00

From Multimodal Perception to Strategic Reasoning: A Survey on AI-Generated Game Commentary

Qirui Zheng, Xingbo Wang, Keyuan Cheng, Yunlong Lu, Muhammad Asif Ali, Lingfeng Li, Yongyi Wang, Wenxin Li

The advent of artificial intelligence has propelled AI-Generated Game Commentary (AI-GGC) into a rapidly expanding research area, offering advantages such as scalable availability and personalized narration. However, existing studies remain fragmented, and a systematic survey that unifies prior efforts is still lacking. To bridge this gap, our survey introduces a unified framework that systematically organizes the AI-GGC landscape. We present a novel taxonomy focused on three core commentator capabilities: Live Observation, Strategic Analysis, and Historical Recall, and further categorize commentary into three corresponding types: Descriptive Commentary, Analytical Commentary, and Background Commentary. Building on this structure, we provide an in-depth review of methods, datasets, and evaluation metrics, analyzing their strengths and limitations. Finally, we highlight key challenges and point out promising directions for future research in AI-GGC.

Natural Language Processing: Language generation · Natural Language Processing: Resources and evaluation · Machine Learning: Multi-modal learning · Multidisciplinary Topics and Applications: Entertainment

#SV40

Session Aug 20 · 11:30–12:30 · Room 10

Poster Aug 20 · 16:30–18:00

From Human Videos to Robot Manipulation: A Survey on Scalable Vision-Language-Action Learning with Human-Centric Data

Zhiyuan Feng, Qixiu Li, Huizhi Liang, Rushuai Yang, Yichao Shen, Zhiying Du, Zhaowei Zhang, Yu Deng, Li Zhao, Hao Zhao, Zongqing Lu, Oier Mees, Marc Pollefeys, Jiaolong Yang, Baining Guo

Recent progress in generalizable embodied control has been driven by large-scale pretraining of Vision–Language–Action (VLA) models. However, most existing approaches rely on large collections of robot demonstrations, which are costly to obtain and tightly coupled to specific embodiments. Human videos, by contrast, are abundant and capture rich interactions, providing diverse semantic and physical cues for real-world manipulation. Yet, embodiment differences and the frequent absence of task-aligned annotations make their direct use in VLA models challenging. This survey provides a unified view of how human videos are transformed into effective knowledge for VLA models. We categorize existing approaches into four classes based on the action-related information they derive: (i) latent action representations that encode inter-frame changes; (ii) predictive world models that forecast future frames; (iii) explicit 2D supervision that extracts image-plane cues; and (iv) explicit 3D reconstruction that recovers geometry or motion. Beyond this taxonomy, we highlight three key open challenges in this area: structuring unstructured videos into training-ready episodes, grounding video-derived supervision into robot-executable actions under embodiment and viewpoint heterogeneity, and designing evaluation protocols that better predict real-world deployment performance and transfer efficiency, thereby informing future research directions.

Robotics: Learning in robotics · Robotics: Manipulation · Robotics: Robotics and vision

#SV41

Session Aug 18 · 11:30–12:30 · Room 9

Poster Aug 18 · 16:30–18:00

Concept Bottleneck Models for Explainable Decision Making: A Survey of Progress, Taxonomy, and Future Directions

Chunjiang Wang, Fan Li, Wenbo Hu, Rui Yan, Kun Zhang, Shaohua Kevin Zhou

Deep neural networks deliver strong performance but remain opaque, limiting their use in high-stakes domains that require transparency and human oversight. Concept Bottleneck Models (CBMs) address this gap by introducing a human-interpretable concept layer that mediates inputs and decisions, enabling semantic explanations and test-time intervention. This survey provides a unified review of CBMs organized along four dimensions: concept acquisition, concept-based decision making, concept intervention, and concept evaluation. We summarize the evolution of concept construction from manual annotation to lexicon-based mining, LLM/VLM-guided generation, and visually grounded discovery via prototypes and diffusion models; review emerging CBM architectures beyond strict bottlenecks; and consolidate evaluation and intervention protocols emphasizing faithfulness, sparsity, and intervenability, with particular relevance to high-stakes domains such as healthcare. We synthesize fragmented literature and outline key challenges and future directions for concept-based interpretable decision making.

AI Ethics, Trust, Fairness: Explainability and interpretability

#SV74

Session Aug 21 · 10:00–11:00 · Room 8

Poster Aug 21 · 15:00–16:00

Constraining Generative Models: A Survey from the Constraint Programming Perspective

Alexandre Bonlarron, François Pachet, Pierre Roy, Jean-Charles Régin

Generative models produce long and high probability sequences, yet they often fail to satisfy explicit constraints set by users. Over the past two decades, Constraint Programming (CP) has provided a complementary paradigm: combining generative models with a constraint solver to guarantee feasibility. This survey reviews the main concepts behind these CP-driven hybrid approaches, from enforcing ubiquitous structural rules (e.g., length and patterns) to preventing plagiarism. It synthesizes how learned models can be treated as constraints, compiled structures, or probabilistic factors. We highlight what has remained stable across applications, then discuss how these principles transfer to the Large Language Model era and outline open challenges for controllable and trustworthy generative systems.

Constraint Satisfaction and Optimization: Constraint programming · Machine Learning: Generative models · Search: Combinatorial search and optimisation · Multidisciplinary Topics and Applications: Arts and creativity

#SV77

Session Aug 20 · 15:00–16:30 · Room 10

Poster Aug 20 · 16:30–18:00

A Survey of Personalized Federated Foundation Models for Privacy-Preserving Recommendation

Zhiwei Li, Guodong Long, Chunxu Zhang, Honglei Zhang, Chengqi Zhang, Jing Jiang

Integrating Foundation Models (FMs) into recommendation systems is an emerging and promising research direction. However, centralized paradigms face growing pressure from privacy concerns and strict regulatory requirements. Federated learning offers a viable solution that enables collaborative model refinement while keeping raw user data on local devices or organizational silos. Yet, applying FMs in this setting creates a fundamental tension, where the system must balance the leverage of global knowledge with the necessity of capturing user personality. This survey provides a comprehensive overview of Personalized Federated Foundation Models for privacy-preserving recommendation, and reviews recent progress in this emerging field. We first analyze personalization techniques that function effectively under federated settings. Furthermore, we discuss the adaptation of foundation models to such federated architectures to balance generalization with user-specific needs for achieving privacy-preserving recommendation. In contrast to existing reviews, our work specifically emphasizes the architectural intersection of federation, personalization, and foundation models.

Data Mining: Recommender systems · Machine Learning: Federated learning · Machine Learning: Foundation models · Data Mining: Privacy-preserving data mining

#SV80

Session Aug 21 · 10:00–11:00 · Room 2

Poster Aug 21 · 15:00–16:00

Tackling Multimodal Learning Challenges with Mixture-of-Expert: A Survey

Liangwei Zheng, Wei Emma Zhang, Olaf Maennel, Lin Yue, Weitong Chen

Mixture-of-Experts (MoE) presents a naturally compatible and scalable framework for multimodal learning, demonstrating strong adaptability across diverse modalities and tasks. Despite its growing success, a comprehensive and systematic evaluation of multimodal MoE remains lacking. Existing surveys tend to address either multimodal learning or MoE independently, overlooking the unique interplay between them. This survey fills that gap by addressing a central question: How does MoE effectively resolve multimodal challenges? We approach this from three key perspectives: (1) MoE as an Efficient Multimodal Framework: enabling scalable multimodal modeling by decoupling computational cost from parameter growth and mitigating modality redundancy through selective expert activation; (2) MoE as a Multimodal Representation Learner: integrating complementary multi-opinion expert knowledge to enrich alignment and interaction representations; and (3) MoE as a Multimodal Adapter: providing a modular and flexible mechanism to model imperfect modality data such as modality imbalance and missing modality. Through an extensive literature review, we identify critical research gaps, including interpretable routing, expert communication, modality integration, and lifelong multimodal learning. We position this survey as a foundation for future research toward interpretable, adaptive, and sustainable multimodal Mixture-of-Experts systems.

Machine Learning: Multi-modal learning · Data Mining: Information retrieval · Data Mining: Mining heterogeneous data · AI Ethics, Trust, Fairness: Trustworthy AI

#SV82

Session Aug 21 · 10:00–11:00 · Room 5

Poster Aug 21 · 15:00–16:00

AI-Enhanced Vein Biometrics: A Comprehensive Survey

Yifan Wang, Jie Gui, Changsheng Chen, Alex Kot

Vein biometrics has emerged as a promising biometric modality for personal identity authentication, benefiting from its intrinsic properties such as high discriminative capability, resistance to forgery, and contactless acquisition. Recent advances in artificial intelligence, particularly deep learning, have significantly accelerated its development. This paper presents a comprehensive and systematic survey of AI-enhanced vein biometrics. We review fundamental principles, publicly available datasets, and evaluation protocols, and systematically analyze existing methods across the entire vein biometric pipeline, including acquisition, preprocessing, feature extraction, recognition and verification, security and privacy protection, and multimodal fusion. Furthermore, we summarize representative application scenarios, identify key challenges, and highlight promising directions for future research. To facilitate reproducible research and long-term development of the field, we release an open, evolving research resource Awesome-Vein-Biometrics that systematically summarizes and tracks recent advances in vein biometrics.

Computer Vision: Biometrics, face, gesture and pose recognition

#SV87

Session Aug 20 · 11:30–12:30 · Room 2

Poster Aug 20 · 16:30–18:00

A Comprehensive Survey of Deep Learning for Multivariate Time Series Forecasting: A Channel Strategy Perspective

Xiangfei Qiu, Hanyin Cheng, Xingjian Wu, Junkai Lu, Jilin Hu, Chenjuan Guo, Christian S. Jensen, Bin Yang

Multivariate Time Series Forecasting (MTSF) plays a crucial role across diverse fields, ranging from economics to energy to traffic. In recent years, deep learning has demonstrated outstanding performance in MTSF tasks. In MTSF, modeling the correlations among different channels is critical, as leveraging information from other related channels can significantly improve the prediction accuracy of a specific channel. This study systematically reviews the channel modeling strategies for time series and proposes a taxonomy organized into three hierarchical levels: the strategy perspective, the mechanism perspective, and the characteristic perspective. On this basis, we provide a structured analysis of these methods and conduct an in-depth examination of the advantages and limitations of different channel strategies. Finally, we summarize and discuss some future research directions to provide useful research guidance. Moreover, we maintain an up-to-date GitHub repository which includes all the papers discussed in the survey.

Data Mining: Mining spatial and/or temporal data · Machine Learning: Time series and data streams

#SV92

Session Aug 20 · 15:00–16:30 · Room 4

Poster Aug 20 · 16:30–18:00

Large Language Models for Blockchain Security and Analytics: A Survey

Collette Eguakun Okundia, Cuneyt Akcora, Arijit Khan

Large Language Models (LLMs) are transforming blockchain security and analytics, yet a systematic evaluation of their capabilities remains limited. This survey provides a comprehensive, AI‑centric assessment of LLM‑based methods across over 70 recent studies spanning 11 application domains, such as security auditing, transaction fraud detection, and cryptocurrency portfolio management. Our unified taxonomy standardizes task formulations and evaluation practices to enable a comparison of six LLM roles across domains. For each domain, we review input representations tailored to blockchain data, LLM architectures, and learning and inference paradigms (e.g., fine‑tuning, retrieval‑augmented generation, and agentic strategies). Our review analyzes the strengths, limitations, and emerging patterns of LLM roles observed in current systems. Finally, we provide practical guidance for selecting LLMs for specific roles and outline promising research directions.

Data Mining: Applications · Data Mining: Mining graphs · Machine Learning: Supervised Learning

#SV96

Session Aug 19 · 15:00–16:30 · Room 7

Poster Aug 19 · 16:30–18:00

Dynamic Heterogeneous Graph Representation Learning: A Survey

Huan Liu, Pengfei Jiao, Jie Yin, Hongjiang Chen, Zhidong Zhao

Graph representation learning (GRL) serves as a canonical paradigm for modeling complex networks. However, real-world AI systems inherently manifest as evolving heterogeneous entities with complex interactions, posing significant challenges to static or homogeneous modeling. To address these complexities, representation learning for Dynamic Heterogeneous Graphs (DHGs) has emerged as a vital approach for learning low-dimensional representations that simultaneously preserve structural semantics and temporal dynamics. This survey presents the first systematic review of DHG representation learning methods. We first introduce a unified formal definition that encompasses both discrete-time and continuous-time DHGs from the perspective of temporal granularity. Building upon this formulation, we propose a novel algorithm-centric taxonomy that categorizes existing literature, including early embedding-based approaches, graph neural network (GNN)-based models, and relatively recent Transformer-based DHG methods, while explicitly highlighting their intrinsic modeling biases with respect to dynamic granularity. Furthermore, we summarize representative applications of DHG representation learning, along with commonly used datasets and benchmarks. Finally, we discuss promising research directions that guide future advances in this rapidly evolving field.

Data Mining: Mining graphs · Machine Learning: Representation learning · Machine Learning: Self-supervised Learning · Machine Learning: Sequence and graph learning

#SV108

Session Aug 18 · 11:30–12:30 · Room 7

Poster Aug 18 · 16:30–18:00

A Survey of Joint Online-Offline Fine-tuning for Large Language Models

Taihang Zhen, Guang Yang, Chenzhang Li, Nuo Yan, Shilong Zhou, Guangyu Liu, Xiaotong Tang, Jing Huo, Boyan Wang, Junlan Feng, Yuyao Zhang

Post-training for Large Language Models (LLMs) can be mainly categorized into offline Supervised Fine-Tuning (SFT) for knowledge acquisition and online Reinforcement Fine-Tuning (RFT) for adaptive refinement. Current state-of-the-art approaches typically employ a sequential cold-start pipeline (SFT-Then-RFT). However, we argue that this disjoint transition imposes an “alignment tax”, leading to catastrophic forgetting and reward hacking during the unregularized exploration phase. In this work, we advocate for Joint Online-Offline Fine-Tuning as a superior paradigm that breaks the convention of restricting offline data to SFT and online data to RFT. By integrating full offline response generation with online rollouts, particularly within the realm of Reinforcement Learning with Verifiable Rewards (RLVR), this approach mitigates the limitations of isolated training phases. We provide the first comprehensive survey focusing specifically on the synchronization of data provenance. We introduce a novel taxonomy for related works, analyze their theoretical advantages in balancing stability with plasticity, and outline a roadmap for next-generation post-training frameworks.

Natural Language Processing: Language models · Machine Learning: Supervised Learning · Machine Learning: Reinforcement learning

#SV112

Session Aug 18 · 11:30–12:30 · Room 5

Poster Aug 18 · 16:30–18:00

Multimodal Emotion Recognition with Large Language Models

Hongrui Zhang, Daiqing Wu, Yangyang Li, Kuien Liu, Yuhui Wang, Yu Zhou, Sicheng Zhao

Multimodal Emotion Recognition (MER) focuses on identifying and interpreting emotions from modality-compound inputs. Closely mirroring human cognitive processes in real-world environments, MER has drawn substantial attention from both academia and industry. Recently, a paradigm shift has been unveiled in MER, from leveraging small-scale, task-specific models to Large Language Models (LLMs). We refer to the latter as the MER-with-LLMs paradigm, which offers unprecedented generality, spurring numerous empirical attempts, even alongside speculation about their potential to achieve general emotional intelligence. However, with these new opportunities come new challenges, including the scarcity of emotionally annotated data, the affective gap both within and across modalities, and the opacity of affective interpretation. To systematically review existing research and guide future exploration, this paper categorizes prior works according to their focus on addressing these challenges into three directions: Affective Data Augmentation, Multimodal Affective Representation, and Multimodal Affective Reasoning. By thoroughly tracing the development, emerging trends, and remaining issues within each direction, this paper aims to provide a clear academic map of the MER-with-LLMs paradigm and foster its structured advancement.

Computer Vision: Interpretability and transparency · Computer Vision: Multimodal learning · Computer Vision: Video analysis and understanding · Natural Language Processing: Language generation · Natural Language Processing: Language models

#SV134

Session Aug 19 · 15:00–16:30 · Room 7

Poster Aug 19 · 16:30–18:00

From Time Series Analysis to Question Answering: A Survey in the LLM Era

Wei Li, Zhe Xie, Yuxuan Liang, Xinli Hao, Yunyao Cheng, Dan Pei, Xiaofeng Meng

Recently, Large Language Models (LLMs) have introduced a novel paradigm in Time Series Analysis (TSA), leveraging strong language capabilities to support tasks such as forecasting and anomaly detection. However, these analysis tasks cannot adequately cover temporal language tasks, such as interpretation and captioning. A fundamental gap remains between TSA and LLMs: LLMs are pre-trained to optimize natural language relevance for question answering rather than objectives specialized for TSA. To bridge this gap, TSA is evolving toward Time Series Question Answering (TSQA), shifting from expert-driven and task-specific analysis to user-driven and task-unified question answering. TSQA depends on flexible exploration rather than predefined TSA pipelines. In this survey, we first propose a taxonomy that reflects the evolution from TSA to TSQA, driven by a shift from external to internal alignment. We then organize existing literature into three alignment paradigms: Injective Alignment, Bridging Alignment, and Internal Alignment, and provide practical guidance for flexible, economical, and generalizable selection of alignment paradigms. We finally analyze datasets across domains and characteristics, identify challenges, and highlight future research directions.

Data Mining: Mining spatial and/or temporal data · Natural Language Processing: Question answering
#SV137

Session Aug 19 · 15:00–16:30 · Room 8

Poster Aug 19 · 16:30–18:00

Modeling Liquid Democracy: A Survey of the (Computational) Social Choice Literature

Davide Grossi, Andreas Nitsche, Georgios Papasotiropoulos

Liquid democracy encompasses a family of decision-making processes where votes can be cast directly or passed along proxy chains. We provide a community-maintainable and systematic survey of (computational) social choice papers on liquid democracy, organized through a searchable taxonomy of core modeling features that have appeared in the literature. Drawing on the insights from our survey, we also outline a number of research directions, which we consider of special importance for both the theory and practice of liquid democracy.

Game Theory and Economic Paradigms: Computational social choice
#SV140

Session Aug 18 · 15:00–16:30 · Room 10

Poster Aug 18 · 16:30–18:00

A Survey on the Verification of Reinforcement Learning Policies

Luca Marzari, Ezio Bartocci, Enrico Marchesini

Reinforcement learning (RL) is increasingly applied in complex, safety-critical domains, yet the lack of rigorous behavioral guarantees for neural network-based policies remains a major barrier to deployment. Recent advances in policy expressiveness and scale have intensified this challenge, leading to a rapidly growing but conceptually fragmented body of work on RL policy verification. This survey provides a unifying perspective on RL verification methods. We introduce a taxonomy that clarifies relationships among existing approaches along three axes: verification paradigm (formal versus probabilistic), temporal scope (step-wise versus multi-step), and guarantee strength. Beyond taxonomy, we unify underlying theoretical foundations, make implicit assumptions and limitations explicit, and identify emerging directions.

Agent-based and Multi-agent Systems: Formal verification, validation and synthesis · Machine Learning: Reinforcement learning
#SV141

Session Aug 20 · 15:00–16:30 · Room 6

Poster Aug 20 · 16:30–18:00

Approximation Algorithms for the Shapley Value: Taxonomy and Properties

Patrick Kolpaczki, Eyke Hüllermeier

Attributing importance to the individual components of a larger unit has become a popular method for understanding models and data in AI and machine learning. Starting with feature explanation, this method is now also used in data valuation and federated learning, to name just two. Despite their differences, all of these applications use the same mathematical attribution mechanism: the Shapley value, which is rooted in cooperative game theory. While the Shapley value is appealing and has strong axiomatic foundations, it is computationally intractable due to the combinatorial explosion of player subsets. Therefore, there is a need for approximation algorithms, which have been studied intensively in recent years. This survey provides an overview of general-purpose approximation methods applicable to any domain. We categorize these methods into algorithmic classes, compare their properties, and highlight connections between approaches in a comprehensive taxonomy.

Game Theory and Economic Paradigms: Cooperative games · Machine Learning: Game Theory
#SV144

Session Aug 19 · 11:30–12:30 · Room 8

Poster Aug 19 · 16:30–18:00

Spatial Pattern Matching: A Survey

Nicole R. Schneider, Kent O'Sullivan, Hanan Samet

Recent developments in Artificial Intelligence (AI) have led to new ways for users to search through vast information. However, users may have questions that are grounded in the real world that require spatial inference, for which language models are not well suited. Conversely, traditional spatial search methods, like spatial pattern matching, answer spatial reasoning questions correctly but are noise-intolerant, slow, and brittle. Presently, there are opportunities to integrate AI and spatial pattern matching to enable robust and flexible spatial search. This paper surveys existing spatial pattern matching methods, including the few that apply machine learning to the problem, discussing their efficiency and limitations, and describing opportunities to further enable spatial search through AI.

Knowledge Representation and Reasoning: Qualitative, geometric, spatial, and temporal reasoning · Data Mining: Mining spatial and/or temporal data · Data Mining: Information retrieval · Data Mining: Knowledge graphs and knowledge base completion
#SV146

Session Aug 20 · 10:00–11:00 · Room 6

Poster Aug 20 · 16:30–18:00

A Survey on Value Alignment in Agentic AI Systems

Wei Zeng, Hengshu Zhu, Chuan Qin, Han Wu, Yihang Cheng, Yinuo Shen, Zhe Wang, Yuyang Wang, Sirui Zhang, Xiaowei Jin, Zhenxing Wang, Feimin Zhong, Hui Xiong

With the evolution of artificial intelligence (AI) paradigms towards agentic AI, the widespread integration of large language models (LLMs) enhances system capabilities while also introducing situational risks and challenges of value misalignment, making value alignment in agentic AI systems a critical issue. This paper constructs a multi-level value framework encompassing L0 (universal values), L1 (cultural and industry values), and L2 (context-specific values). Guided by this framework, we conduct an in-depth analysis along the technical stack: at the LLM level, we examine value injection mechanisms through pretraining and post-training; at the single-agent level, we focus on representing and injecting values into agents, profiles and memory, and planning and action; at the multi-agent level, we summarize collaborative alignment methods such as communication strategy optimization and multi-objective reinforcement learning. Following a systematic review of existing datasets and methods for multi-level alignment evaluation, we outline future research directions, including inter-agent value coordination mechanisms, high-quality scenario data sharing, game-theoretic design for value alignment in agent interaction, and communication protocol alignment, aiming to establish a more systematic and dynamic evaluation framework and to promote robust and trustworthy value consensus in agentic AI systems within social collaboration.

Agent-based and Multi-agent Systems: Other · AI Ethics, Trust, Fairness: Values
#SV153

Session Aug 20 · 10:00–11:00 · Room 2

Poster Aug 20 · 16:30–18:00

Adaptive Reward Design in Reinforcement Learning: A Taxonomy and Survey

Raphaela Baybas, Carlo D'Eramo, Philipp Brune

Adaptive Reward Design (ARD) is becoming a fundamental component for Reinforcement Learning (RL) agents, as they are deployed in increasingly complex settings where a single static reward across all phases of learning is rarely sufficient. Yet, ARD is rarely studied as a coherent topic: Relevant ideas are dispersed across reward shaping, curriculum learning, intrinsic motivation, non-stationary objectives, and preference- or feedback-based learning, which obscures conceptual connections and complicates method selection. This survey provides a unified view of ARD in RL by introducing a taxonomy, organized by the primary driver of the reward variation. The taxonomy distinguishes external-feedback-driven reward updates from reward adaptations driven by endogenous within-run signals and those conditioned on exogenous context signals. Using explicit assignment rules, we place work published between 2010 and 2025 within this taxonomy. By synthesizing typical RL settings and domains at the driver level, we simplify the method selection in ARD. Further, we describe the evolution and current trends of ARD and conclude by outlining promising future research directions.

Machine Learning: Reinforcement learning
#SV155

Session Aug 19 · 11:30–12:30 · Room 6

Poster Aug 19 · 16:30–18:00

Beyond Scaling: A Survey of Data-Efficient Learning for LLM Agents

Yaqing Wang, Zhenlin Luo, Peiyao Zhao, Yunfeng Cai, Quanming Yao

LLM-based agents perceive, reason, act, and adapt through interaction. While scaling remains important, agentic progress also depends on extracting more learning signal from limited experience, including demonstrations, feedback, reasoning traces, tool-use records, and interaction trajectories. This survey develops an agent-centric view of data-efficient learning. We formulate an agentic learning loop, define data efficiency as capability improvement without proportional increases in supervision or real-environment trial-and-error, and synthesize methods into experience augmentation, agent structural design, and learning paradigms. We also summarize representative application domains and open challenges in experience reuse and cost-aware evaluation. Overall, the agentic era requires smarter ways to acquire, structure, reuse, and learn from limited experience.

Agent-based and Multi-agent Systems: Applications · Machine Learning: Few-shot learning · Machine Learning: Learnware/model reuse/transfer learning
#SV158

Session Aug 18 · 15:00–16:30 · Room 7

Poster Aug 18 · 16:30–18:00

LLM-Based Intelligent Tutoring Systems: A Survey

Li Kong, Jianwen Sun, Junsheng Zhou, Vincent Ng

Large Language Models (LLMs) are reshaping the design and capabilities of intelligent tutoring systems (ITS) by providing powerful generative, reasoning, and interaction abilities, which surpass traditional rule-based approaches. This survey presents a structured overview of LLM-based ITS and analyzes how these models transform classical system components and architectures. We first review the foundational concepts of traditional ITS and introduce the functional roles of the main components, followed by LLM-based techniques and related datasets for realizing each of these components. Furthermore, we examine the key application domains and conclude the survey by outlining future research directions.

Natural Language Processing: Applications
#SV160

Session Aug 19 · 11:30–12:30 · Room 9

Poster Aug 19 · 16:30–18:00

An XAI View on Explainable ASP: Methods, Systems, and Perspectives

Thomas Eiter, Tobias Geibinger, Zeynep G. Saribatur

Answer Set Programming (ASP) is a popular declarative reasoning and problem solving approach in symbolic AI. Its rule-based formalism makes it inherently attractive for explainable and interpretive reasoning, which is gaining importance with the surge of Explainable AI (XAI). A number of explanation approaches and tools for ASP have been developed, which often tackle specific explanatory settings and may not cover all scenarios that ASP users encounter. In this survey, we provide, guided by an XAI perspective, an overview of types of ASP explanations in connection with user questions for explanation, and describe their coverage by current theory and tools. Furthermore, we pinpoint gaps in existing ASP explanation approaches and identify research directions for future work.

AI Ethics, Trust, Fairness: Explainability and interpretability · Knowledge Representation and Reasoning: Logic programming · Knowledge Representation and Reasoning: Non-monotonic reasoning
#SV161

Session Aug 19 · 10:00–11:00 · Room 7

Poster Aug 19 · 16:30–18:00

Accelerating Masked Diffusion Large Language Models: A Survey of Efficient Inference Techniques

Daehoon Gwak, Minhyung Lee, Junwoo Park, Jaegul Choo

Diffusion large language models (dLLMs) offer a theoretical advantage in parallel generation over standard autoregressive models. However, parallel generation alone does not guarantee practical speedups. Realizing this efficiency requires specialized inference mechanisms, such as diffusion-aware caching and reuse. Consequently, as inference efficiency becomes a prerequisite for practical deployment, recent research has actively explored acceleration techniques across algorithms, architectures, and systems. However, rigorous comparisons remain difficult, as end-to-end latency stems from intricate trade-offs between algorithmic, architectural, and system-level factors that are often conflated in existing benchmarks. In this survey, we introduce a unified latency decomposition framework for dLLMs to disentangle these factors and analyze their impact on inference speed in real deployments. Guided by this framework, we categorize acceleration techniques along three axes covering algorithmic innovations, architectural and system optimizations, and inference-time scaling. Finally, we provide guidelines for reproducible benchmarking and highlight open challenges for realizing the full potential of parallel generation.

Natural Language Processing: Language models · Natural Language Processing: Resources and evaluation
#SV181

Session Aug 20 · 15:00–16:30 · Room 3

Poster Aug 20 · 16:30–18:00

Test-Time Adaptation for Graph Learning: A Systematic Survey

Jiayi Chen, Xin Zheng, Bo Li, Zeyu Wang, Yanqing Guo, Feng Xia

Graph distribution shifts between training and test graphs pose severe challenges to the generalization of graph neural networks (GNNs). In real-world deployment, application environments are continuously evolving, while retraining or redesigning GNNs is often costly and impractical. In light of this, test-time adaptation on graphs, which aims to dynamically adapt well-trained GNNs or adjust test graphs to improve inference performance, has attracted growing attention as a practical solution. In this survey, we provide a comprehensive review of test-time adaptation on graphs, an emerging yet underexplored research direction. We identify two fundamental challenges: (1) Data-level: complex graph distribution shifts; and (2) Model-level: limited test-time learning information. Building on these challenges, we present a systematic taxonomy of existing methods into (a) model-centric, (b) data-centric, and (c) hybrid methods, followed by a summary of representative applications, benchmarks, and open opportunities. We aim to bridge the gap between laboratory GNN development and real-world deployment via test-time adaptation. The repository of papers, code, and datasets is at https://github.com/jiayichen1121/Test-Time-Adaptation-for-Graph-Learning.

Data Mining: Mining graphs · Data Mining: Networks
#SV185

Session Aug 19 · 15:00–16:30 · Room 3

Poster Aug 19 · 16:30–18:00

Learning PDE Solvers with Physics and Data: A Unifying View of Physics-Informed Neural Networks and Neural Operators

Yilong Dai, Shengyu Chen, Ziyi Wang, Xiaowei Jia, Yiqun Xie, Vipin Kumar, Runlong Yu

Partial differential equations (PDEs) are central to scientific modeling. Modern workflows increasingly rely on machine learning-based components. Despite the emergence of various physics-aware data-driven approaches, the field still lacks a unified perspective to uncover their relationships, limitations, and appropriate roles in scientific workflows. To this end, we propose a unifying perspective that places two dominant paradigms, Physics-Informed Neural Networks (PINNs) and Neural Operators (NOs), within a shared design space. We organize existing methods from three fundamental dimensions: what is learned, how physical structures are integrated into the learning process, and how the computational load is amortized across problem instances. In this way, many practical challenges can be best understood as consequences of these structural properties of learning PDEs. By analyzing recent advances through this unifying view, our survey aims to facilitate the development of reliable learning-based PDE solvers and help catalyze a deeper synthesis of physics and data.

Machine Learning: Knowledge-aided learning
#SV195

Session Aug 20 · 15:00–16:30 · Room 7

Poster Aug 20 · 16:30–18:00

Harnessing Multiple Large Language Models: A Survey on LLM Ensemble

Zhijun Chen, Xiaodong Lu, Jingzheng Li, Pengpeng Chen, Zhuoran Li, Kai Sun, Yuankai Luo, Qianren Mao, Ming Li, Likang Xiao, Dingqi Yang, Yikun Ban, Hailong Sun

LLM Ensemble, which involves the comprehensive use of multiple large language models (LLMs), each aimed at handling user queries during downstream inference, to benefit from their individual strengths, has gained substantial attention recently. The widespread availability of LLMs, coupled with their varying strengths and out-of-the-box usability, has profoundly advanced the field of LLM Ensemble. This paper presents the first systematic review of recent developments in LLM Ensemble. First, we introduce our taxonomy of LLM Ensemble and discuss several related research problems. Then, we provide a more in-depth classification of the methods under the broad categories of "ensemble-before-inference", "ensemble-during-inference", and "ensemble-after-inference", and review relevant methods. Finally, we introduce related benchmarks and applications, summarize existing studies, and suggest future research directions. The GitHub project is available at https://github.com/junchenzhi/Awesome-LLM-Ensemble.

Agent-based and Multi-agent Systems: Other · Natural Language Processing: Applications · Machine Learning: Ensemble methods
#SV208

Session Aug 18 · 15:00–16:30 · Room 9

Poster Aug 18 · 16:30–18:00

Machine Learning Methods for Studying Latent Neural Activity Dynamics

Shufeng Kong, Fumei Deng, Xinyi Dong, Caihua Liu, Weiwei Chen, Yingheng Wang, Daniel Cao, Azahara Oliva, Antonio Fernandez-Ruiz, Carla P. Gomes

Recent developments in brain recording are driving a demand for machine learning tools capable of decoding the latent structure of large populations of neurons. In this paper, we provide a comprehensive survey that outlines the trajectory of Latent Variable Models (LVMs) from early state-space models to more recent deep generative models. We organize the literature into three closely related domains: (1) Single-Region Latent Dynamics, which ranges from linear dynamical systems to more complex dynamics represented by Recurrent Neural Networks (RNNs) and Neural Ordinary Differential Equations (ODEs); (2) Multi-Region Communication, which employs probabilistic as well as subspace methods to study how information is transferred across different brain areas considering synaptic propagation delays and network connectivity; and (3) Behavior-Aligned Modeling, which seeks to disentangle neural activity related to task performance from other internal states via supervised or contrastive learning. Finally, we conclude and discuss benchmarks, evaluation criteria, and open challenges, such as the ability to identify causal links or directionality of communication, to facilitate future research for bridging interpretable brain dynamics with reliable neural decoding.

Multidisciplinary Topics and Applications: Bioinformatics · Multidisciplinary Topics and Applications: Computational sustainability · Multidisciplinary Topics and Applications: Life sciences
#SV220

Session Aug 18 · 11:30–12:30 · Room 9

Poster Aug 18 · 16:30–18:00

A Survey on Actionable Interpretability in Large Language Models

Jie Cai, Mafizur Rahman, James Enouen, Lijun Qian, Yan Liu

Large Language Models (LLMs) have become central to modern AI, with interpretability playing an increasingly important role in investigating the opaque and complex mechanisms encoded within billions of parameters and ensuring trustworthy deployment. However, descriptive interpretability approaches for LLMs remain largely post-hoc, explaining model behavior without actionable guidance for model improvement, thereby limiting their practical utility. Recent work has therefore shifted toward actionable LLM interpretability, emphasizing methods that connect internal mechanisms to interventions and model improvement rather than explanation alone. This survey reviews LLM interpretability through the lens of actionability, presenting a taxonomy of attributional and mechanistic approaches, along with emerging methods tailored to vision–language models (VLMs). We further examine how actionable interpretability supports downstream objectives such as hallucination mitigation, knowledge editing, fairness, safety, capability and efficiency. By positioning actionable interpretability as a pathway for better-guided LLM design and practice, this survey outlines key challenges and future directions toward trustworthy and controllable foundation models.

AI Ethics, Trust, Fairness: Explainability and interpretability · AI Ethics, Trust, Fairness: Fairness and diversity · AI Ethics, Trust, Fairness: Trustworthy AI
#SV226

Session Aug 20 · 10:00–11:00 · Room 1

Poster Aug 20 · 16:30–18:00

Deep Learning and Foundation Models for Weather Prediction: A Survey

Jimeng Shi, Azam Shirali, Bowen Jin, Sizhe Zhou, Wei Hu, Rahuul Rangaraj, Zhaonan Wang, Yanzhao Wu, Leonardo Bobadilla, Upmanu Lall, Shaowen Wang, Jiawei Han, Giri Narasimhan

Numerical weather prediction (NWP) models remain the cornerstone of atmospheric sciences. Yet, deep learning (DL) is challenging this paradigm by its ability to capture intricate spatio-temporal patterns and deliver ultra-fast predictions. Analogous to the foundation models (e.g., ChatGPT) in natural language processing, foundation models in the weather/climate domain have also been developed. This paper reviews DL and foundation models for weather prediction by highlighting their strengths and limitations. In particular, we carefully examine them from the perspective of their training paradigms: deterministic predictive learning, probabilistic generative learning, and pre-training & fine-tuning. For each paradigm, we summarize the underlying model architectures, training methods, and respective features. To facilitate further study, we provide a curated repository featuring categorized papers, open-source code, and benchmark datasets. Finally, we discuss and suggest potential research directions across new tasks and models in weather data storage and management, and operational deployment, further inspiring innovations in this rapidly evolving field. GitHub: https://github.com/JimengShi/DL-Foundation-Models-Weather.

Machine Learning: Applications · Multidisciplinary Topics and Applications: Energy, environment and sustainability · Multidisciplinary Topics and Applications: Life sciences · Multidisciplinary Topics and Applications: Other
#SV238

Session Aug 19 · 10:00–11:00 · Room 9

Poster Aug 19 · 16:30–18:00

A Resource-Aware Taxonomy of AI Bias Mitigation Techniques

Daniela Loreti, Roberta Calegari, Michela Milano

The literature on AI fairness has grown rapidly, proposing a large number of bias mitigation techniques that are commonly organized into pre-, in-, and post-processing methods. This pipeline-centric view offers an operational, lifecycle-based perspective on where mitigation can be applied. In deployment settings, however, practitioners also face an additional question: whether a mitigation family is applicable given the resources and access rights available in a concrete system. In this survey, we use resources broadly to denote data access/control, training capability, and deployment-time interface/decision control. Accordingly, we introduce a resource-aware taxonomy that complements existing taxonomies by classifying AI bias mitigation methods according to the conditions that make them practically implementable. We use this taxonomy to structure and reinterpret existing literature on the topic, highlighting which mitigation families remain feasible under resource constraints.

AI Ethics, Trust, Fairness: Bias · AI Ethics, Trust, Fairness: Ethical, legal and societal issues · AI Ethics, Trust, Fairness: Fairness and diversity · AI Ethics, Trust, Fairness: Trustworthy AI
#SV241

Session Aug 20 · 10:00–11:00 · Room 3

Poster Aug 20 · 16:30–18:00

Graph4LLM: A Systematic Survey of Graph-Enhanced Large Language Models

Xinyan Zhu, Cheng Yang, Qiuyu Wang, Zeyuan Guo, Yiding Wang, Zedi Liu, Chunchen Wang, Chuan Shi

Large Language Models (LLMs) excel in natural language processing (NLP) tasks. However, they suffer from inherent limitations due to their sequence-based nature, such as structural information loss and factual unreliability. Graphs, with the ability to explicitly model entities and relations, offer an effective way to address these shortcomings. To systematically synthesize the emerging research on graph-enhanced LLMs, this survey, Graph4LLM, examines how these methods integrate graphs into various stages of the LLM pipeline, including the input, model, and output phases. For each phase, we provide a detailed review of the key methods and techniques. We also introduce a wide range of application scenarios where Graph4LLM methods demonstrate significant potential. Finally, we outline the challenges and future research directions for developing more efficient and interpretable solutions.

Data Mining: Mining graphs · Natural Language Processing: Language models
#SV245

Session Aug 18 · 11:30–12:30 · Room 10

Poster Aug 18 · 16:30–18:00

A Survey on Quantitative Possibility Theory in Artificial Intelligence: A Convenient Epistemic Uncertainty Model

Sébastien Destercke, Didier Dubois, Henri Prade

Quantitative (or numerical) possibility theory offers a simple yet very expressive setting for handling higher-order uncertainty and in particular imprecise probabilities. The paper surveys the basic ideas and notions underlying numerical possibility theory, its relation to the other uncertainty settings, and its use in AI-related issues. Numerical possibility theory appears to be of interest for coping with imperfect statistical information, especially non-Bayesian statistics relying on likelihood functions and confidence intervals. Quantitative possibility theory can be used in inference, machine learning, tracking and information fusion, and finally preference modeling.

Uncertainty in AI: Graphical models · Uncertainty in AI: Nonprobabilistic models · Uncertainty in AI: Uncertainty representations
#SV252

Session Aug 19 · 10:00–11:00 · Room 1

Poster Aug 19 · 16:30–18:00

A Survey on Active Feature Acquisition Strategies

Linus Aronsson, Arman Rahbar, Morteza Haghir Chehreghani

Active feature acquisition (AFA) studies how a predictive system can sequentially choose which feature values to obtain for each instance to balance predictive accuracy against feature acquisition cost. This survey provides the first unified treatment of modern AFA through a partially observable Markov decision process (POMDP) formulation, showing that most existing methods can be understood as different approximations of the same underlying sequential decision problem. The survey proposes an up-to-date taxonomy organizing AFA into four categories: (i) embedded cost-aware predictors (notably cost-sensitive decision trees and ensembles), (ii) model-based methods that plan using learned probabilistic components, (iii) model-free methods that learn acquisition policies from simulated episodes, and (iv) hybrid methods that combine the strengths of model-based and model-free approaches. We argue that this POMDP-centric view clarifies connections among existing methods and motivates more principled algorithm design. Since much prior work is heuristic and lacks formal guarantees, we also outline routes to guarantees by connecting AFA to adaptive stochastic optimization. We conclude by highlighting open challenges and promising directions for future research.

Machine Learning: Active learning · Machine Learning: Cost-sensitive learning · Machine Learning: Feature extraction, selection and dimensionality reduction · Machine Learning: Partially observable reinforcement learning and POMDPs · Uncertainty in AI: Sequential decision making
#SV253

Session Aug 20 · 15:00–16:30 · Room 2

Poster Aug 20 · 16:30–18:00

Sparsity in Federated Learning: A Survey

Alessio Mora, Adriano Guastella, Lorenzo Sani, Paolo Bellavista, Nicholas D. Lane

Conventional Federated Learning (FL) pipelines focus on the collaborative training of a global dense model across client devices. Sparsity has been increasingly adopted in FL, during or after local optimization, for a range of objectives, including reducing communication and computation costs, supporting unlearning, enhancing privacy guarantees, and improving local personalization. In this survey, we introduce a novel taxonomy of sparse FL methods that systematically organizes the existing literature according to their core objectives and methodological choices. Using this taxonomy, we analyze and categorize prior work, highlighting the underlying intuitions, technical mechanisms, benefits, and limitations of each class of approaches. Finally, we identify open challenges, expose research gaps, and extract guidance to help practitioners understand and adopt sparsity mechanisms in FL.

Machine Learning: Federated learning · Machine Learning: Learning sparse models
#SV263

Session Aug 20 · 10:00–11:00 · Room 6

Poster Aug 20 · 16:30–18:00

A Comprehensive Survey of Interaction Techniques in 3D Scene Generation

Yuqi Li, Siwei Meng, Chuanguang Yang, Weilun Feng, Junming Liu, Zhulin An, Yikai Wang, Yingli Tian

3D scene generation has rapidly evolved, significantly promoting the innovation of content creation. In this context, interaction techniques serve as a pivotal bridge connecting user intent with the generative models, thereby enabling precise control, real-time feedback, and personalized customization of complex 3D scenes. Existing literature reviews predominantly focus on general generative paradigms, or are limited to specific subdomains such as single-object modeling, while often overlooking the systematic classification of interaction mechanisms. To bridge this gap, this work presents a comprehensive survey of interaction techniques in 3D scene generation. We propose a unified taxonomy that categorizes existing methods into three primary paradigms: Interactive Generation, Interactive Editing, and Embodied Interaction. For each category, we analyze representative methods in terms of controllability, interaction granularity, and physical consistency, and discuss their advantages and limitations. We further summarize commonly used datasets and evaluation protocols for interactive 3D scene generation. Finally, we discuss future directions toward more physically grounded, multi-modal, and user-centered interactive 3D scene generation systems. A curated list of the related papers mentioned in this work can be found at Awesome-Interactive-Techniques-in-3D-Scene-Generation-Lists.

Agent-based and Multi-agent Systems: Human-agent interaction · Computer Vision: 3D computer vision · Machine Learning: Generative models
#SV270

Session Aug 20 · 15:00–16:30 · Room 9

Poster Aug 20 · 16:30–18:00

Implicit Identity Technologies for LLMs: Fingerprinting and Watermarking Across Datasets, Models, and Generated Content

Bing Liu, Shunping Wang, Yufan Zhu, Xinyi Yu, Jing Huang, Linkang Du, Hongbin Pei, Wei Luo

Large language models (LLMs) require substantial investments and are increasingly deployed in high-stakes domains, making it critical to protect LLM-related assets and to trace their provenance. Identity technologies such as fingerprinting and watermarking address these needs by enabling ownership verification and attribution, and have rapidly emerged as an active research focus. However, existing techniques lack a systematic organisation, leading to two key issues, terminological confusion and isolated research lines, that have hindered the development of this research field. To this end, we present a comprehensive review of LLM identity techniques, focusing on fingerprinting and watermarking across the LLM lifecycle, including datasets, models, and generated content. We make three primary contributions. First, we introduce implicit identity (Implicit-ID for short) as a unifying abstraction and distinguish fingerprinting from watermarking. Second, we propose a lifecycle-based taxonomy that organises techniques by asset type and verification role, aligning each with asset protection or provenance. Third, we establish an evaluation framework around three objectives: identifiability, robustness, and deployability. Together, these contributions structure the landscape of LLM identity techniques, clarify terminology, and highlight directions toward secure deployment.

AI Ethics, Trust, Fairness: AI and law, governance, regulation · AI Ethics, Trust, Fairness: Ethical, legal and societal issues · AI Ethics, Trust, Fairness: Safety and robustness
#SV272

Session Aug 19 · 10:00–11:00 · Room 6

Poster Aug 19 · 16:30–18:00

LLM-Based Agents on the Edge: A Survey of Privacy, Scalability, Heterogeneity, and Autonomy

Nikita Agrawal, Ruben Mayer

Large language model (LLM)–based agents are increasingly being deployed beyond centralized cloud environments and toward the edge of the network, where they operate closer to data sources. This transition facilitates lower latency and enhances contextual awareness, privacy, and responsiveness, but it also introduces challenges that differ from traditional cloud-based agent deployments. This survey provides a systematic overview of LLM-based edge agents with a particular focus on four critical dimensions: privacy, scalability, heterogeneity, and autonomy. To facilitate structured analysis, we introduce a novel taxonomy along four axes: deployment, functional role, interaction, and adaptation. Based on our taxonomy, we analyze the challenges LLM-based agents face on the edge and discuss design solutions that can help mitigate possible issues. We further analyze the degree to which existing LLM-based edge agent frameworks achieve privacy, scalability, heterogeneity, and autonomy.

Agent-based and Multi-agent Systems: Agent communication · Agent-based and Multi-agent Systems: Coordination and cooperation · AI Ethics, Trust, Fairness: Trustworthy AI · Knowledge Representation and Reasoning: Learning and reasoning
#SV273

Session Aug 20 · 11:30–12:30 · Room 4

Poster Aug 20 · 16:30–18:00

Knowledge-Guided 3D CT Generation: A Conditioning-Centric Taxonomy

Francesca Pia Panaccione, Eugenio Lomurno, Matteo Matteucci

Controllable generation guided by external knowledge is a key requirement in modern generative deep learning applications, enabling the synthesis of samples with explicit constraints on semantic content, structural properties, and variability. In 3D Computed Tomography (CT), such control is essential for clinical applications, including data augmentation, privacy-preserving data sharing, and the simulation of specific anatomical or pathological scenarios. While research on conditional 3D CT generation has expanded rapidly, the diversity of existing approaches makes systematic comparison difficult and obscures fundamental design choices. In this survey, we propose a conditioning-centric taxonomy that organizes the literature along three orthogonal dimensions: the type of external knowledge (K), the knowledge integration paradigm (I), and the generative architecture (A). This factorization defines an explicit design space (K × I × A) that provides a unified perspective on prior work. Using this framework, we systematize existing methods, identify dominant trends and recurring design patterns, and highlight underexplored regions of the design space that point toward promising directions for future research.

Computer Vision: 3D computer vision · Computer Vision: Biomedical image analysis · Computer Vision: Multimodal learning · Machine Learning: Deep learning architectures · Machine Learning: Generative models
