IJCAI-ECAI 2026 Accepted Papers · Special Track on AI4Tech: AI Enabling Critical Technologies
Presentation format
Every accepted paper is presented in two formats: an oral talk — which must be delivered in person in Bremen by one of the authors — and a poster during a dedicated poster session.
#AI4T17
Safe and Efficient Control: A Subgraph-Augmented Hierarchical Reinforcement Learning Framework for Dynamically Reconfigurable Battery Systems
Dynamically Reconfigurable Battery (DRB) systems employ power electronic switches to create dynamic topologies. They enable effective management of cell differences through real-time adjustment of cell connections. However, existing DRB control methods struggle to learn effective strategies due to sparse rewards, which arise from blind exploration in large topological action spaces and complex operational constraints. This leads to insufficient policy learning, making safety and balancing performance difficult to ensure in practical applications. To this end, we propose a Subgraph-Augmented Hierarchical Reinforcement Learning (SAHRL) framework. By combining hierarchical policies with topological structural knowledge, SAHRL effectively accelerates policy exploration and mitigates reward sparsity. Specifically, the high-level policy determines the strategic direction, while the subgraph-augmented low-level policy refines actions to meet operational constraints. The topological structural knowledge, extracted in the form of subgraphs and incorporated as an inductive bias, helps the agent focus on meaningful action patterns and reduce invalid exploration in the large action space. Extensive simulations and real-world experiments show that SAHRL achieves safe and efficient balancing. Notably, it increases the energy release by 10.56% compared to conventional methods in real-world applications.
Domain-specific AI4Tech: Other AI4Tech applications; Emerging AI4Tech: Emerging AI4Tech areas
#AI4T22
DEPLOY-RL: Active Boundary Discovery and Conservative Certification for Deployable Reinforcement Learning in Safety-Critical Continuous Processes
Reinforcement learning (RL) policies often outperform classical controllers in simulation, yet rarely reach production in safety-critical processes. The barrier is that there is no principled way to answer “Is this policy safe to deploy?” with statistical guarantees. We introduce DEPLOY-RL, a post-training certification framework built on one key insight: deployment certification requires discovering where failures occur (boundary discovery), not measuring how much everywhere (uniform reconstruction). Our contributions: (1) a contract-coupled acquisition function that concentrates sampling on certification-critical boundaries, achieving ≈ 2× sample efficiency with a semi-empirical ambiguity reduction bound (domain-calibrated convergence guarantee); (2) conformal risk control providing finite-sample false-go guarantees (≤ α) under explicit deployment contracts; (3) a three-way decision framework (Deploy/No-Deploy/Abstain) with fail-safe PID/MPC fallback. In simulations on papermaking (industrial digital twin) and Tennessee Eastman (public benchmark), DEPLOY-RL achieves a 4.4% false-go rate (vs. 8.6% for the best baseline) while retaining 88.6% policy coverage, making it the only method achieving <5% false-go with >85% coverage among 14 baselines under our evaluation protocol.
Advanced AI4Tech: AI4Tech foundations; Advanced AI4Tech: Data-driven AI4Tech; Domain-specific AI4Tech: AI4Manufacturing; Domain-specific AI4Tech: AI4Safety
#AI4T43
Translating Latent Representations for Money Laundering Detection
Anti-money laundering (AML) systems are important for safe economic trade and for the fight against financial crime. Recently, a number of AML algorithms based on graph neural networks (GNNs) and graph transformers (GTs) have been proposed. Compared to traditional machine learning solutions, these methods have been shown to achieve significantly better detection results. Yet, the state-of-the-art AML algorithms have a key limitation: they fail to jointly address money laundering classification and money laundering sub-network discovery, despite their strong theoretical connection. To bridge this gap, we propose a translation-based AML system (TAML) that is capable of jointly solving both problems within the same latent space. Our extensive experimental evaluation on multiple datasets demonstrates the superiority of TAML over the state-of-the-art in both tasks.
Domain-specific AI4Tech: AI4Finance
#AI4T44
DUALFloodGNN: Physics-informed Graph Neural Network for Operational Flood Modeling
Flood models inform strategic disaster management by simulating the spatiotemporal hydrodynamics of flooding. While physics-based numerical flood models are accurate, their substantial computational cost limits their use in operational settings where rapid predictions are essential. Models designed with graph neural networks (GNNs) provide both speed and accuracy while having the ability to process unstructured spatial domains. Given their flexible input and architecture, GNNs can be leveraged alongside physics-informed techniques with ease, significantly improving interpretability and generalizability. We introduce a novel flood GNN architecture, DUALFloodGNN, which embeds physical constraints at both global and local scales through explicit loss terms. The model jointly predicts water volume at nodes and flow along edges through a shared message-passing framework. To improve performance for autoregressive inference, model training is conducted with a multi-step loss enhanced with dynamic curriculum learning. Compared with standard GNN architectures and state-of-the-art GNN flood models, DUALFloodGNN achieves substantial improvements in predicting multiple hydrologic variables (e.g., water volume, flow, and depth) while maintaining high computational efficiency. The model is open sourced at https://github.com/acostacos/dual_flood_gnn. The dataset is open sourced at https://doi.org/10.25910/9xav-0s86.
Advanced AI4Tech: Data-driven AI4Tech; Advanced AI4Tech: Deep AI4Tech; AI4Tech infrastructure/systems: AI social systems; Domain-specific AI4Tech: AI4Home and AI4City; Domain-specific AI4Tech: AI4Safety
#AI4T45
Bridging the Data Scarcity in Venous Thromboembolism Detection: A Deep Learning Framework for Large-scale Irregular Clinical Time Series
Venous thromboembolism (VTE) is a common and life-threatening complication in cancer patients after treatment. Early risk assessment and detection of VTE primarily rely on clinical indicators, such as blood test results. However, existing studies are limited to static or snapshot-based models, failing to capture the evolving dynamics of disease progression, as deep time-series modeling is hindered by the lack of longitudinal clinical data. To address this gap, we introduce CliTsVTE, a large-scale clinical time-series dataset curated for VTE modeling, comprising 501,063 samples from 26,022 patients over seven years across nine cancer types. The dataset contains continuous time gaps between consecutive time points. Unlike many benchmarks, CliTsVTE reflects real-world clinical settings and presents unique challenges in continuous irregular time-series modeling with long-term irregularity and varying data granularity, which makes missingness significantly consequential. To tackle this, we propose a deep learning framework integrating multiple sequential backbones with an adversarially regularized autoencoder (ARAE) that learns latent representations to eliminate missingness. Experiments on CliTsVTE show that our best model achieves 88.7% accuracy and an AUC of 0.952, significantly outperforming traditional time-point models and regular time-series benchmarks. These results establish a strong benchmark for deep modeling of continuous irregularity in clinical time-series data and highlight the potential of AI-driven large-scale clinical datasets in solving real-world medical research challenges.
Advanced AI4Tech: Data-driven AI4Tech; Advanced AI4Tech: Deep AI4Tech; Domain-specific AI4Tech: AI4Care and AI4Health; Domain-specific AI4Tech: AI4Biotech
#AI4T55
Spherical Physics-Informed Neural Operator with Multi-Scale Coupling for Meteorological Downscaling
Meteorological downscaling is crucial for high-resolution regional climate forecasting and disaster early warning. While neural operators have emerged as a promising paradigm for modeling complex spatiotemporal mappings, existing frameworks often struggle with spherical manifold geometric distortions, inherent atmospheric multi-scale coupling mismatches, and lack of explicit atmospheric laws. We propose the Spherical Physics-informed Neural Operator, which utilizes a Spherical Laplacian Decomposition to partition atmospheric fields into hierarchical frequency components, maintaining exact point-wise correspondence across scales. To evaluate these representations at arbitrary locations, we introduce a localized spherical integral operator that approximates continuous kernel transforms via geometry-aware attention. Dynamical consistency is further enforced by embedding differentiable constraints into the learning process. Extensive experiments demonstrate that our framework attains superior accuracy and zero-shot generalization across various meteorological variables and unseen queries, representing a robust and interpretable solution for global-to-regional meteorological downscaling.
Domain-specific AI4Tech: Other AI4Tech applications
#AI4T68
Leveraging Implicit Contexts via LLM–Graph Fusion for Temporal Knowledge Graph Reasoning
Temporal knowledge graph (TKG) reasoning is critical for modeling and forecasting the evolution of real-world events. Existing TKG construction pipelines transform raw text into structured temporal quadruples as graph facts. However, in this process, they often fail to preserve reasoning-relevant contextual semantics from the original corpus that cannot be explicitly represented as graph facts, leaving such implicit contextual information unused by current temporal reasoning models. To address this limitation, we propose a Textual-Temporal Graph Fusion Network (TTGFN), a context-aware framework that leverages implicit contexts from text via LLM-based semantic encoding and fuses them with structure-constrained temporal graph representations for reasoning. To the best of our knowledge, this is the first work to systematically leverage LLMs to reuse previously overlooked implicit contextual information and incorporate it into temporal knowledge graph reasoning, substantially improving model performance. Extensive experiments conducted on three TKG reasoning benchmark datasets demonstrate that TTGFN outperforms the state-of-the-art approaches, with Hits@1 gains of 20.85% on ICEWS14, 31.14% on ICEWS05-15, and 24.22% on ICEWS18.
Advanced AI4Tech: Data-driven AI4Tech; Advanced AI4Tech: Generative and LLMs-driven AI4Tech
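For readers unfamiliar with the Hits@1 metric quoted in the abstract above, the following is a minimal sketch of how Hits@k is usually computed from the rank of the correct entity in each query. The function name `hits_at_k` and the toy ranks are illustrative assumptions, not taken from the paper, and the sketch says nothing about whether the reported gains are relative or absolute.

```python
def hits_at_k(ranks, k=1):
    """Fraction of queries whose correct answer is ranked within the top k.

    `ranks` holds the 1-based rank of the true entity for each query
    (hypothetical input; a real evaluation would derive it from model scores).
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    hits = sum(1 for rank in ranks if rank <= k)
    return hits / len(ranks)


# Toy example: four queries, the true entity is ranked 1st, 3rd, 1st and 2nd.
example_ranks = [1, 3, 1, 2]
hits_1 = hits_at_k(example_ranks, k=1)
hits_3 = hits_at_k(example_ranks, k=3)
```

Hits@1 is the strictest variant: a query only counts when the correct entity is the single top-ranked prediction.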
#AI4T79
Beyond Isolated Investor: Predicting Startup Success via Roleplay-Based Collective Agents
Due to the high value and high failure rates of startups, predicting their success is a critical challenge. Existing approaches typically model startup success from a single decision-maker's perspective, overlooking the collective dynamics that dominate real-world venture capital (VC) decision-making. We propose SimVC-CAS, a collective agent system that simulates VC decisions as a multi-agent interaction process. By designing role-playing agents and a GNN-based supervised interaction module, we reformulate startup financing prediction as a group decision-making task, capturing both enterprise fundamentals and investor network dynamics. Each agent represents an investor with distinct traits and preferences, enabling heterogeneous evaluations and realistic information exchange over a graph-structured co-investment network.
Using both proprietary and public VC data with strict anti-leakage controls, we show that SimVC-CAS significantly improves predictive performance, achieving approximately 25% relative improvement in average precision@10, while exhibiting consistency with real investor decisions. The interaction mechanism is particularly effective for network-central startups, confirming the importance of networks in VC decision-making. Analysis of agents' reasoning for decision changes further reveals how the network environment influences decision quality, demonstrating the system's interpretability. Our approach may generalize to broader group decision-making scenarios. Our code is available at https://github.com/ZhangDataLab/SimVC-CAS.
Domain-specific AI4Tech: AI4Finance
#AI4T95
Conspiracy Spoofing Detection via Structure-Augmented Generative Graph Model
Detecting spoofing in financial trading is a critical data mining task. Traditional machine learning models often focus on individual node features, failing to capture the contextual relationships among interconnected nodes. Graph-based methodologies have enhanced this by effectively integrating relational data. Recent advancements in fraud detection demonstrate substantial performance gains by incorporating structure information into detection models. However, spoofing transactions often exhibit a distribution shift compared to historical transactions, rendering historical data less effective. Instead, certain trading patterns, such as motif structures, consistently manifest in transaction graphs regardless of distribution shift, providing a robust alternative for analysis. Motif structures, particularly node motifs, are essential for capturing higher-order interactions and structural patterns within transaction graphs. This paper introduces the Structure-Augmented Generative Graph Model (SAG2M) to address the challenge of detecting conspiracy spoofing through a substructure frequency-augmented detection method. Specifically, our approach extracts the frequency of subgraph patterns among neighboring nodes, leveraging an enumeration algorithm to efficiently identify node orbit data. The extracted motif frequencies are then encoded into a structure-augmented generative framework, enabling detailed structural representations of each transaction (node). Subsequently, a temporal and heterogeneous graph generation and aggregation scheme is applied to collect neighborhood node information, uncovering conspiracy spoofing patterns effectively. Our experiments on datasets such as Amazon, YelpChi, and T-Finance demonstrate that SAG2M outperforms existing models in detection accuracy.
A case study focusing on conspiracy spoofing detection further highlights the model’s superior effectiveness in identifying such complex fraudulent behaviors.
Advanced AI4Tech: Data-driven AI4Tech; Domain-specific AI4Tech: AI4Finance
#AI4T98
HieraMix: A Hierarchical MLP-Mixer for Large-Scale Traffic Forecasting
Traffic forecasting is significant to modern urban management. Recently, there has been growing attention on large-scale forecasting, as it better reflects the complexity of real-world traffic networks. However, existing models often exhibit quadratic computational complexity, making them impractical for large-scale real-world scenarios. In this paper, we propose a novel framework, Spatio-Temporal Hierarchical Mixer (HieraMix), which leverages an all-MLP architecture for efficient and effective large-scale traffic forecasting. HieraMix employs a hierarchical spatiotemporal mixing block to extract multi-resolution features through bottom-up aggregation and top-down propagation. Furthermore, an adaptive region mixer generates transformation matrices based on regional semantics, enabling our model to dynamically capture evolving spatiotemporal patterns for different regions. Extensive experiments conducted on four large-scale real-world datasets demonstrate that the proposed method not only achieves state-of-the-art performance but also exhibits competitive computational efficiency.
Domain-specific AI4Tech: AI4Transport
#AI4T104
LLM-guided Cutting-plane Management for Mixed-integer Linear Programming
Cutting planes are central to mixed-integer linear programming (MILP) solving, yet their effectiveness hinges on expert tuning of separator configurations and hand-crafted cut-selection heuristics, creating a high barrier for non-specialists. Learning-based methods can reduce manual effort, but typically require large training datasets and often generalize poorly beyond the instance classes they are trained on. We propose an LLM-guided cutting-plane management framework that integrates large language models into the MILP solving pipeline. First, using chain-of-thought (CoT) prompting, the LLM infers an instance-specific separator configuration, deciding which separators to activate and how to set key parameters from the problem type and structural features. Then, it translates the evolving branch-and-bound state into natural language to perform stage-aware cut selection, choosing a high-quality subset of cuts that tightens the relaxation and improves overall solver performance. Leveraging the LLM's reasoning capabilities and rich background knowledge, our work removes dependence on domain-specific training data and substantially reduces reliance on expert-crafted configurations. Experiments show consistent improvements over SCIP's default settings, hand-crafted heuristics, and recent learning-based cut selection baselines.
Advanced AI4Tech: Generative and LLMs-driven AI4Tech; Advanced AI4Tech: Other advances
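As background for the hand-crafted cut-selection heuristics the abstract above compares against, here is a minimal sketch of one widely used score, efficacy: the Euclidean distance by which a cut a·x ≤ b cuts off the current LP solution x*. This is a generic illustration and not the paper's LLM-guided method; the function names, the top-k rule, and the toy cuts are assumptions made for the example.

```python
import math


def efficacy(coeffs, rhs, lp_solution):
    """Efficacy of the cut coeffs . x <= rhs at the LP solution.

    Positive values mean the cut is violated by (i.e. cuts off) lp_solution.
    """
    norm = math.sqrt(sum(a * a for a in coeffs))
    if norm == 0.0:
        raise ValueError("cut has an all-zero coefficient vector")
    activity = sum(a * x for a, x in zip(coeffs, lp_solution))
    return (activity - rhs) / norm


def select_top_cuts(cuts, lp_solution, max_cuts):
    """Return indices of the most efficacious violated cuts (hypothetical rule)."""
    scored = [(efficacy(a, b, lp_solution), i) for i, (a, b) in enumerate(cuts)]
    violated = [(score, i) for score, i in scored if score > 0.0]
    # Highest efficacy first; ties broken by original index for determinism.
    violated.sort(key=lambda item: (-item[0], item[1]))
    return [i for _, i in violated[:max_cuts]]


# Toy example: a fractional LP solution and three candidate cuts (coeffs, rhs).
x_star = [0.5, 0.5]
candidate_cuts = [([1.0, 1.0], 0.5), ([1.0, 0.0], 0.0), ([0.0, 1.0], 1.0)]
chosen = select_top_cuts(candidate_cuts, x_star, max_cuts=2)
```

Practical solvers typically combine efficacy with other criteria such as parallelism between cuts; the single-score ranking here is kept deliberately simple.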
#AI4T108
A Durable Machine Unlearning Framework to Nullify Recall of Sensitive Data on Incremental Training
The advancement of data privacy regulations has spurred the development of Machine Unlearning (MU), which is designed to remove the influence of sensitive data from a trained model and results in an unlearned model (ULM). Despite rapid progress in MU techniques, their vulnerabilities remain underexplored, which poses risks due to potential leakage of unlearned information. In realistic scenarios, ULMs always need to be incrementally trained with newly collected data samples, which can cause the model to recall sensitive information if the new dataset contains samples similar or identical to the unlearned ones. To address this issue, we devise a Durable Unlearning Enhancement (DUE) framework to avoid restoring unwanted sensitive information from incremental training data samples. The DUE framework has three key components that identify sensitive samples and suppress their gradients to update ULMs. Extensive experiments on state-of-the-art MU methods across multiple real-world datasets show that the proposed DUE framework can effectively nullify the recall of sensitive information after MU, and even improve the performance of ULMs. Consequently, our work establishes a new fundamental research direction in safe training against MU vulnerabilities.
Advanced AI4Tech: AI4Tech foundations; Advanced AI4Tech: Data-driven AI4Tech; Advanced AI4Tech: Deep AI4Tech
#AI4T112
MAgSeg: Segmentation of Agricultural Landscapes in High-Resolution Satellite Imagery using Multimodal Large Language Models
Agricultural landscape segmentation in the Global South is challenging as it is characterized by fragmented plots, high intra-class variance, and a scarcity of labeled training data. Recent advances in segmentation have been made by Multimodal Large Language Models (MLLMs). However, current approaches encounter critical context length bottlenecks and a domain alignment gap in understanding satellite features. We address these limitations through MAgSeg, a novel, decoder-free MLLM segmentation approach. MAgSeg is an architecturally efficient approach that enables standard MLLMs to perform segmentation of complex smallholder agricultural landscapes from high-resolution satellite imagery, without requiring auxiliary vision decoders. We introduce a novel instruction tuning data format designed to enable scalable fine-tuning and post-training on high resolution satellite imagery, which enables MAgSeg to learn from the global context of the image while generating text tokens for only a patch within the image. Extensive evaluations on datasets spanning three countries in the Global South demonstrate that MAgSeg significantly outperforms state-of-the-art MLLM baselines, offering a scalable solution to map smallholder agricultural environments.
Domain-specific AI4Tech: AI4Agriculture
#AI4T120
Belief-Contraction-Driven Active Inverse Source Localization and Characterization
Active inverse source localization and characterization (ISLC) in dynamic fields requires sequential decision making under partial observability, where a mobile sensor must infer latent source parameters from sparse, noisy readings. We introduce a belief-contraction-driven approach that unifies inference, stopping, and control. An attention-augmented particle filter stabilizes Bayesian belief updates through ESS-based resampling, feature-aware sparse attention smoothing, and Metropolis–Hastings rejuvenation that preserves the filtering posterior. Belief contraction (posterior dispersion) defines both a termination rule and a goal-aligned intrinsic reward, enabling reinforcement learning without distance-to-source shaping. Across seven field modalities, spatial out-of-distribution tests, and nonstationary source shifts, our agent (ATT-PFRL) achieves higher completion, faster convergence, and more accurate localization than planning and RL+Bayes baselines under similar computation. Fixed-trajectory studies also show improved ESS and lower RMSE, isolating the benefit of the inference layer.
Advanced AI4Tech: AI4Tech foundations; Advanced AI4Tech: Data-driven AI4Tech; Advanced AI4Tech: Deep AI4Tech
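The ESS-based resampling mentioned in the abstract above can be illustrated with a minimal, generic particle-filter sketch. Effective sample size (ESS) is computed as 1 / Σ w_i² over normalized weights, and resampling is triggered when it drops below a fraction of the particle count. The 0.5 threshold, the multinomial resampling scheme, and all names are assumptions for illustration; the paper's attention smoothing and Metropolis–Hastings rejuvenation steps are not reproduced here.

```python
import random


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2), computed on weights normalized to sum to 1."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)


def resample_if_degenerate(particles, weights, threshold_ratio=0.5, rng=None):
    """Multinomial resampling, applied only when ESS < threshold_ratio * N.

    threshold_ratio=0.5 is an assumed, commonly used default.
    """
    rng = rng or random.Random(0)
    n = len(particles)
    if effective_sample_size(weights) >= threshold_ratio * n:
        return list(particles), list(weights)
    # Draw N particles with probability proportional to weight, then reset weights.
    resampled = rng.choices(particles, weights=weights, k=n)
    return resampled, [1.0 / n] * n


# Toy example: one particle dominates, so the weights are degenerate.
particles = ["a", "b", "c", "d"]
degenerate_weights = [0.97, 0.01, 0.01, 0.01]
new_particles, new_weights = resample_if_degenerate(particles, degenerate_weights)
```

With uniform weights the ESS equals the number of particles and no resampling happens; as weight concentrates on a few particles the ESS falls toward 1.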
#AI4T132
PENTESTLLMAGENT: A Task Dependency Graph Planning-Based Multi-Agent Framework for Automated Penetration Testing
Fully autonomous IP-to-Root penetration testing remains challenging for LLM agents. We conduct an exploratory study on 10 LLMs and introduce AutoPentest-Bench, an end-to-end benchmark with 13 VulnHub targets and 93 sub-tasks. From 130 interaction logs, we identify three challenges: Rigid Strategy, Contextual Forgetting, and Command Generation Hallucination. To address them, we propose PentestLLMAgent, which integrates a Task Dependency Graph (TDG) for dynamic planning and backtracking; a Hierarchical Multi-Agent Architecture (HMA) with function-calling-based tool invocation, output filtering, and semantic compression; and Executable Knowledge-Guided Command Generation (EKG-CG) for retrieving and executing pre-validated, environment-compatible commands. Evaluations demonstrate strong effectiveness: on end-to-end AutoPentest-Bench, PentestLLMAgent achieves a 77% success rate; on AutoPT’s web exploitation benchmark, it attains a 95% overall pass rate; and on the privilege escalation benchmark, it achieves a 100% success rate. In realistic end-to-end runs, it averages 10.9 minutes, 36 interaction rounds, and 69.6K tokens per target. The code, benchmark, and executable knowledge base are publicly available at https://github.com/sanbai123/PentestLLMAgent_code-and-videos.
Advanced AI4Tech: Generative and LLMs-driven AI4Tech; Domain-specific AI4Tech: AI4Security
#AI4T138
Physics-Guided Geometric Diffusion for Macro Placement Generation
Macro placement is a pivotal stage in VLSI physical design, fundamentally determining the overall chip performance. Recent data-driven placement methods have demonstrated significant potential, yet they often struggle to handle sequential dependencies and to balance topological connectivity with physical constraints. To bridge this gap, we propose MacroDiff+, a physics-guided geometric diffusion framework. Specifically, we design a dual-domain denoising architecture that couples topological connectivity encoded by heterogeneous GNNs with global geometric context modeled by a Transformer. Furthermore, we introduce Physics-Guided Sampling, an inference strategy that actively steers the generation using explicit gradients to ensure both statistical plausibility and physical validity. On the ISPD2005 MMS benchmarks, MacroDiff+ outperforms state-of-the-art baselines with a 6.1–6.2% reduction in wirelength. Notably, it exhibits superior stability and scalability on large-scale designs where prior methods fail to converge. The source code is provided at https://github.com/jhy00n/MacroDiff-plus.
Advanced AI4Tech: Generative and LLMs-driven AI4Tech; AI4Tech infrastructure/systems: AI chips, AI sensors, AI computers; Domain-specific AI4Tech: AI4Manufacturing; Emerging AI4Tech: Emerging AI4Tech areas
#AI4T139
MORL-CA: Dynamic Multi-Objective Reinforcement Learning for Chlor-Alkali Process Optimization Under Time-Varying Conditions
Chlor-alkali production is a large-scale industrial process whose operating conditions and equipment states evolve over time. Its process optimization requires ongoing trade-offs among conflicting objectives such as product yield, energy consumption, and equipment life. Existing optimization approaches are typically static and must be re-optimized after environmental changes, limiting their real-world applicability. In this work, we model the problem as a dynamic multi-objective sequential decision-making problem that continuously tracks a time-varying Pareto set under changing conditions. We propose MORL-CA, a multi-objective reinforcement learning framework that integrates offline pretraining on historical data with constrained online policy refinement. MORL-CA introduces a state-aware adaptive objective weighting mechanism within a multi-critic actor-critic architecture, enabling localized Pareto-improving policy updates while satisfying operational and safety constraints. Extensive experiments in an environment constructed from real chlor-alkali data demonstrate that MORL-CA achieves superior Pareto solution quality and smoother adaptation to dynamics compared with state-of-the-art multi-objective optimizers and MORL baselines.
Domain-specific AI4Tech: AI4Manufacturing; Domain-specific AI4Tech: Other AI4Tech applications
#AI4T149
BrainCGT: A Brain Graph Transformer for Modeling Causal Connectivity in Neurological Disorder Diagnosis
Brain connectivity analysis is a fundamental tool for identifying biomarkers and understanding neurological disorders. Most existing approaches employ graph transformers over undirected functional connectivity networks, which are typically estimated using correlation statistics. Although effective for capturing statistical associations, these models do not represent directed interactions between brain regions that arise from causal relationships. As a result, direction-specific disease mechanisms are not explicitly modeled, and interpretability is often limited. To address this gap, we present BrainCGT, a brain graph transformer designed to model causal connectivity inferred from fMRI time-series data. In this framework, brain networks are modeled as directed graphs with a modular organization, where nodes correspond to individual brain regions and directed edges reflect causal flow of information between them. Direction-aware node representations together with direction-biased attention mechanisms allow the model to capture asymmetric interactions across regions. Experimental results on three large-scale fMRI datasets demonstrate that BrainCGT achieves consistently better performance than existing graph-based methods for neurological disorder classification. In addition, examination of the learned attention structures shows correspondence with established neurobiological pathways, suggesting improved interpretability. These results highlight the importance of incorporating causal directionality into brain graph transformer architectures for robust and interpretable neuroimaging analysis.
Advanced AI4Tech: Data-driven AI4Tech; Advanced AI4Tech: Neuro AI4Tech; Domain-specific AI4Tech: AI4Care and AI4Health; Domain-specific AI4Tech: Other AI4Tech applications; Emerging AI4Tech: Emerging AI4Tech areas
#AI4T151
ACCFormer: Predicting Analog Circuit Performance Metrics via Topology-Aware Transformers
Reusing and migrating analog circuit intellectual property (IP) across process nodes poses a significant challenge in modern chip design. Efficient and generalizable circuit performance prediction methods for analog circuits are crucial to achieving this goal. Current data-driven approaches typically rely on manually designed features, which perform poorly on unseen circuit architectures and struggle to model the inherent structural relationships within analog designs. To address these challenges, we propose ACCFormer, a novel topology-aware Transformer framework for predicting performance metrics of analog circuits. Our model combines device parameters with connectivity data to learn topology-aware representations, followed by a performance-oriented cross-attention mechanism where trainable metric queries adaptively focus on the most critical devices for each target parameter. Validated across different process nodes, our model achieves state-of-the-art prediction accuracy and demonstrates strong cross-process adaptability, highlighting its potential to accelerate IP reuse and reduce design cycles.
Advanced AI4Tech: Data-driven AI4Tech; AI4Tech infrastructure/systems: AI chips, AI sensors, AI computers; Domain-specific AI4Tech: Other AI4Tech applications
#AI4T168
HT-Transformer: Event Sequences Classification by Accumulating Prefix Information with History Tokens
Deep learning has achieved strong results in modeling sequential data, including event sequences, temporal point processes, and irregular time series. Recently, transformers have largely replaced recurrent networks in these tasks. However, transformers often underperform recurrent networks in classification tasks that aim to predict future targets, such as churn, user reactions, or treatment response. The reason behind this performance gap remains largely underexplored. In this paper, we identify a key limitation of transformers: the lack of a single vector representation that compactly summarizes the evolving state of a sequence. We further show that commonly used contrastive embeddings are poorly suited to capturing the local context needed for accurate forward-looking prediction. To address these challenges, we introduce history tokens, a novel concept that enables the accumulation of historical information during next-token prediction pretraining. Our approach significantly improves transformer-based models, achieving impressive results in finance, e-commerce, and healthcare tasks. The code is publicly available: https://github.com/ivan-chai/pretpp.
Advanced AI4Tech: Deep AI4Tech; Domain-specific AI4Tech: AI4Care and AI4Health; Domain-specific AI4Tech: AI4Customer and AI4Market; Domain-specific AI4Tech: AI4Finance
#AI4T199
DiffLOB: Diffusion Models for Counterfactual Generation in Limit Order Books
Modern generative models for limit order books (LOBs) can reproduce realistic market dynamics, but they remain fundamentally passive: they either model what typically happens without accounting for hypothetical future market conditions, or they require interaction with another agent to explore alternative outcomes. This limits their usefulness for stress testing, scenario analysis, and decision-making. We propose DiffLOB, a regime-conditioned diffusion model for controllable and counterfactual generation of LOB trajectories. DiffLOB explicitly conditions the generative process on future market regimes (including trend, volatility, liquidity, and order-flow imbalance), which enables the model to answer counterfactual queries of the form: “If the future market regime were X instead of Y, how would the limit order book evolve?” We introduce the first systematic evaluation framework for counterfactual LOB generation consisting of three criteria: (1) Realism, measuring how well generated trajectories can reproduce marginal distributions, temporal dependence structure and regime variables; (2) Counterfactual validity, testing whether interventions on future regimes induce consistent changes in the generated LOB dynamics; (3) Counterfactual usefulness, assessing whether synthetic counterfactual trajectories improve downstream prediction of future market regimes.
Domain-specific AI4Tech: AI4Finance
#AI4T220
GraphPerf-RT: Graph-Driven Performance Modeling with Calibrated Uncertainty for OpenMP Scheduling on Heterogeneous Embedded SoCs
Autonomous AI agents on embedded platforms require real-time, risk-aware scheduling under resource and thermal constraints. Classical heuristics struggle with workload irregularity, tabular regressors discard structural information, and model-free reinforcement learning (RL) risks overheating. We introduce GraphPerf-RT, an AI technology achieving deep learning accuracy at heuristic speeds (2-7 ms). GraphPerf-RT is, to our knowledge, the first graph-grounded infrastructure unifying task DAG topology, CFG-derived code semantics, and runtime context (per-core DVFS, thermal state, utilization) in a heterogeneous graph with typed edges encoding precedence, placement, and contention. The architecture supports multi-task evidential heads with Normal-Inverse-Gamma uncertainty; we validate on makespan prediction for risk-aware scheduling. Experiments on three ARM platforms (Jetson TX2, Orin NX, RUBIK Pi) achieve R^2 = 0.81 on log-transformed makespan with Spearman rho = 0.95 and conservative uncertainty calibration (PICP = 99.9% at 95% confidence). Integration with four RL methods demonstrates that multi-agent model-based RL with GraphPerf-RT as the world model achieves 66% makespan reduction and 82% energy reduction versus model-free baselines, with zero thermal violations.
Advanced AI4Tech: AI4Tech foundations; Advanced AI4Tech: Data-driven AI4Tech; Advanced AI4Tech: Deep AI4Tech -
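For readers unfamiliar with the calibration figure quoted in the abstract above, PICP (prediction interval coverage probability) is the fraction of true values that fall inside the model's predicted intervals. The following is a minimal sketch and not the authors' code: it assumes Gaussian predictive intervals for simplicity (an evidential Normal-Inverse-Gamma head would typically yield Student-t intervals instead), and the function name `picp` and the synthetic data are hypothetical. It illustrates why a coverage well above the nominal 95% level is described as conservative.

```python
import random
from statistics import NormalDist


def picp(y_true, mean, std, confidence=0.95):
    """Fraction of targets inside the central `confidence` interval.

    Assumes Gaussian predictive distributions (an illustrative simplification).
    """
    # Two-sided z-score, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    covered = sum(
        1 for y, m, s in zip(y_true, mean, std) if m - z * s <= y <= m + z * s
    )
    return covered / len(y_true)


# Synthetic targets drawn from a standard normal distribution.
rng = random.Random(0)
y = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
mean = [0.0] * len(y)

# Predicted std matches the true noise: coverage lands near the nominal 95%.
calibrated = picp(y, mean, [1.0] * len(y))

# Predicted std is twice the true noise: intervals are too wide, so
# coverage exceeds 95%. This is the "conservative" regime.
conservative = picp(y, mean, [2.0] * len(y))

print(f"calibrated PICP:   {calibrated:.3f}")
print(f"conservative PICP: {conservative:.3f}")
```

A PICP above the nominal level means the intervals over-cover, which errs on the side of caution for risk-aware scheduling at the cost of wider, less informative intervals.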
#AI4T231
Towards Scalable Metaverse Systems with Social-Aware VR Displays
The Metaverse is envisioned to support immersive, large-scale social interactions via virtual reality (VR) displays. However, scalability remains a major bottleneck: as the number of concurrent users grows, tracking and updating display content incurs quadratic overhead, often limiting a shared virtual space to only a few dozen participants. Our key observation is that user attention in social VR is highly selective, with users primarily focusing on socially relevant peers rather than all visible users. Motivated by this observation, we propose SAGE, a social-aware graph-based VR display framework that enables personalized displays based on inferred social relevance. SAGE introduces a dual-graph learning architecture to jointly model long-term social structures and short-term spatiotemporal co-presence patterns, generating complementary interest scores for display prioritization. Based on these scores, we formulate scalable VR display support as a multi-dimensional resource allocation problem and design a lightweight coordination mechanism with provable guarantees, including incentive compatibility and individual rationality. Experiments on Metaverse datasets show that SAGE improves interaction-relevance prediction by 11.64% and increases social welfare by up to 2.4× compared to state-of-the-art schemes. It scales to support up to 1,000 concurrent users and remains robust against strategic manipulation.
Advanced AI4Tech: Metaverse AI4Tech; Domain-specific AI4Tech: Other AI4Tech applications -
#AI4T240
Structured Discrete Graph Generation Model for Fragmented Image Recovery
Fragmented image recovery is an important computer vision problem, with applications such as cultural relic and artwork restoration, archival document recovery, and digital forensics. The goal is to recover the original image topology from an unordered set of fragments, then spatially align and stitch them together. The adjacency relationships among fragments are discrete, sparse, and highly structured, making it difficult for traditional methods to effectively handle global topological consistency. To address this challenge, we propose a fragment adjacency recovery method based on a conditional graph diffusion model. First, we perform discrete denoising pretraining with structural masking to learn structure-aware node representations from perturbed adjacency matrices, using graph neural networks for message passing. Building on this, we design a masked discrete diffusion process tailored for fragment reconstruction, which progressively restores the connectivity between fragments. Furthermore, to enhance the controllability of the generation process, we introduce a topology-guided mechanism that steers the generation of adjacency structures via a topological scoring function, ensuring that the reconstructed fragment graph satisfies global topological constraints. Experimental results demonstrate that our method achieves state-of-the-art performance on hand-torn calligraphy, painting replica, and document datasets, outperforming existing approaches in both accuracy and robustness.
Domain-specific AI4Tech: AI4Arts and AI4Law -
#AI4T243
Physics-in-the-Loop: A Hybrid Agentic Architecture for Validated CAD Engineering Design
Large Language Models (LLMs) can generate Computer-Aided Design (CAD), yet lack the physical comprehension required for reliable engineering design. Instead of attempting to implicitly learn physical laws from data, we propose a Hybrid Agentic-Physical Architecture that embeds validated knowledge-based engineering tools directly into the decision-making loop of autonomous AI agents. In this framework, engineering design is formulated as a closed-loop, sequential decision-making process guided by explicit physical verification. Based on a load case, dedicated agents iteratively plan, generate, evaluate, and revise engineering designs using knowledge-based tools as a feedback signal. We introduce a benchmark dataset and metrics for assessing functional validity in generative CAD. Our system generates more complex and physically verified designs, with a 4.2x increase in structural complexity and a 3.5% improvement in compile rate compared to similar agentic methods. The codebase, prompts, and dataset will be made publicly available to support reproducibility and future research.
Domain-specific AI4Tech: AI4Manufacturing; Advanced AI4Tech: Generative and LLMs-driven AI4Tech; AI4TDomain-specific AI4Tech
