arXivDaily arXiv每日学术速递 周一至周五更新
重置
全部学科分类 1652
2601.00911 2026-04-23 cs.CR cs.AI cs.ET cs.HC cs.LG

Device-Native Autonomous Agents for Privacy-Preserving Negotiations

Joyjit Roy, Samaresh Kumar Singh

Comments 9 pages, 6 figures, 9 tables. This version updates metadata after publication in IEEE Xplore

详情
Journal ref
2026 IEEE SoutheastCon, Huntsville, AL, USA, 2026
英文摘要

Automated negotiations in insurance and business-to-business (B2B) commerce encounter substantial challenges. Current systems force a trade-off between convenience and privacy by routing sensitive financial data through centralized servers, increasing security risks, and diminishing user trust. This study introduces a device-native autonomous Agentic AI system for privacy-preserving negotiations. The proposed system operates exclusively on user hardware, enabling real-time bargaining while maintaining sensitive constraints locally. It integrates zero-knowledge proofs to ensure privacy and employs distilled world models to support advanced on-device reasoning. The architecture incorporates six technical components within an Agentic AI workflow. Agents autonomously plan negotiation strategies, conduct secure multi-party bargaining, and generate cryptographic audit trails without exposing user data to external servers. The system is evaluated in insurance and B2B procurement scenarios across diverse device configurations. Results show an average success rate of 87 %, a 2.4x reduction in latency relative to cloud baselines, and strong privacy preservation through zero-knowledge proofs. User studies show 27 % higher trust scores when decision trails are available. These findings establish a foundation for trustworthy autonomous agents in privacy-sensitive financial domains.

2512.15808 2026-04-23 q-bio.QM cs.AI cs.CV cs.LG

Foundation Models in Biomedical Imaging: Turning Hype into Reality

Amgad Muneer, Kai Zhang, Ibraheem Hamdi, Rizwan Qureshi, Muhammad Waqas, Shereen Fouad, Hazrat Ali, Syed Muhammad Anwar, Jia Wu

Comments 9 figures and 3 tables

详情
英文摘要

Foundation models (FMs) are driving a prominent shift in biomedical imaging from task-specific models to unified backbone models for diverse tasks. This opens an avenue to integrate imaging, pathology, clinical records, and genomics data into a composite system. However, this vision contrasts sharply with modern medicine's trajectory toward more granular sub-specialization. This tension, coupled with data scarcity, domain heterogeneity, and limited interpretability, creates a gap between benchmark success and real-world clinical value. We argue that the immediate role of FMs lies in augmenting, not replacing, clinical expertise. To separate hype from reality, we introduce REAL-FM (Real-world Evaluation and Assessment of Foundation Models), a multi-dimensional framework for assessing data, technical readiness, clinical value, workflow integration, and responsible AI. Using REAL-FM, we find that while FMs excel in pattern recognition, they fall short in causal reasoning, domain robustness, and safety. Clinical translation is hindered by scarce representative data for model training, unverified generalization beyond oversimplified benchmark settings, and a lack of prospective outcome-based validation. We further examine FM reasoning paradigms, including sequential logic, spatial understanding, and symbolic domain knowledge. We envision that the path forward lies not in a monolithic medical oracle, but in coordinated subspecialist AI systems that are transparent, safe, and clinically grounded.

2512.12463 2026-04-23 stat.ML cs.LG math.ST stat.TH

Understanding Overparametrization in Survival Models through Interpolation

Yin Liu, Jianwen Cai, Didong Li

详情
英文摘要

Classical statistical learning theory predicts a U-shaped relationship between test loss and model capacity, driven by the bias-variance trade-off. Recent advances in modern machine learning have revealed a more complex pattern, double-descent, in which test loss, after peaking near the interpolation threshold, decreases again as model capacity continues to grow. While this behavior has been extensively analyzed in regression and classification, its manifestation in survival analysis remains unexplored. This study investigates overparametrization in four representative survival models: DeepSurv, PC-Hazard, Nnet-Survival, and N-MTLR. We rigorously define interpolation and finite-norm interpolation, two key characteristics of loss-based models to understand double-descent. We then show the existence (or absence) of (finite-norm) interpolation of all four models. Our findings clarify how likelihood-based losses and model implementation jointly determine the feasibility of interpolation and show that overparametrization should not be regarded as benign for survival models. All theoretical results are supported by numerical experiments that highlight the distinct generalization behaviors of survival models.

2511.17265 2026-04-23 cs.AR cs.AI cs.ET cs.PF

DISCA: A Digital In-memory Stochastic Computing Architecture Using A Compressed Bent-Pyramid Format

Shady Agwa, Yikang Shen, Shiwei Wang, Themis Prodromakis

Comments This work has been accepted for publication in the 2025 37th International Conference on Microelectronics (ICM)

详情
Journal ref
2025 37th International Conference on Microelectronics (ICM)
英文摘要

Nowadays, we are witnessing an Artificial Intelligence revolution that dominates the technology landscape in various application domains, such as healthcare, robotics, automotive, security, and defense. Massive-scale AI models, which mimic the human brain's functionality, typically feature millions and even billions of parameters through data-intensive matrix multiplication tasks. While conventional Von-Neumann architectures struggle with the memory wall and the end of Moore's Law, these AI applications are migrating rapidly towards the edge, such as in robotics and unmanned aerial vehicles for surveillance, thereby adding more constraints to the hardware budget of AI architectures at the edge. Although in-memory computing has been proposed as a promising solution for the memory wall, both analog and digital in-memory computing architectures suffer from substantial degradation of the proposed benefits due to various design limitations. We propose a new digital in-memory stochastic computing architecture, DISCA, utilizing a compressed version of the quasi-stochastic Bent-Pyramid data format. DISCA inherits the same computational simplicity of analog computing, while preserving the same scalability, productivity, and reliability of digital systems. Post-layout modeling results of DISCA show an energy efficiency of 3.59TOPS/W per bit at 500 MHz using a commercial 180 nm CMOS technology. Therefore, DISCA significantly improves the energy efficiency for matrix multiplication workloads by orders of magnitude if scaled and compared to its counterpart architectures.

2511.17113 2026-04-23 cs.CR cs.AI cs.LG

AutoGraphAD: Unsupervised network anomaly detection using Variational Graph Autoencoders

Georgios Anyfantis, Pere Barlet-Ros

Comments 6 pages, 5 figures

详情
英文摘要

Network Intrusion Detection Systems (NIDS) are essential tools for detecting network attacks and intrusions. While extensive research has explored the use of supervised Machine Learning for attack detection and characterisation, these methods require accurately labelled datasets, which are very costly to obtain. Moreover, existing public datasets have limited and/or outdated attacks, and many of them suffer from mislabelled data. To reduce the reliance on labelled data, we propose AutoGraphAD, a novel unsupervised anomaly detection based on a Heterogeneous Variational Graph Autoencoder. AutoGraphAD operates on heterogeneous graphs, made from connection and IP nodes that represent network activity. The model is trained using unsupervised and contrastive learning, without relying on any labelled data. The model's losses are then weighted and combined in an anomaly score used for anomaly detection. Overall, AutoGraphAD yields the same, and in some cases better, results than Anomal-E, but without requiring costly downstream anomaly detectors. As a result, AutoGraphAD achieves around 1.18 orders of magnitude faster training and 1.03 orders of magnitude faster inference, which represents a significant advantage for operational deployment.

2511.15141 2026-04-23 cs.IR cs.AI

ItemRAG: Item-Based Retrieval-Augmented Generation for LLM-Based Recommendation

Sunwoo Kim, Geon Lee, Kyungho Kim, Jaemin Yoo, Kijung Shin

Comments Published as a conference paper at SIGIR 2026 (short)

详情
英文摘要

Recently, large language models (LLMs) have been widely used as recommender systems, owing to their reasoning capability and effectiveness in handling cold-start items. A common approach prompts an LLM with a target user's purchase history to recommend items from a candidate set, often enhanced with retrieval-augmented generation (RAG). Most existing RAG approaches retrieve purchase histories of users similar to the target user; however, these histories often contain noisy or weakly relevant information and provide little or no useful information for candidate items. To address these limitations, we propose ItemRAG, a novel RAG approach that shifts focus from coarse user-history retrieval to fine-grained item-level retrieval. ItemRAG augments the description of each item in the target user's history or the candidate set by retrieving items relevant to each. To retrieve items not merely semantically similar but informative for recommendation, ItemRAG leverages co-purchase information alongside semantic information. Especially, through their careful combination, ItemRAG prioritizes more informative retrievals and also benefits cold-start items. Through extensive experiments, we demonstrate that ItemRAG consistently outperforms existing RAG approaches under both standard and cold-start item recommendation settings. Supplementary materials, code, and datasets are provided at https://github.com/kswoo97/ItemRAG.

2511.14311 2026-04-23 eess.SY cs.RO cs.SY

Multi-Timescale Model Predictive Control for Slow-Fast Systems

Lukas Schroth, Daniel Morton, Amon Lahr, Daniele Gammelli, Andrea Carron, Marco Pavone

详情
英文摘要

Model Predictive Control (MPC) has established itself as the primary methodology for constrained control, enabling autonomy across diverse applications. While model fidelity is crucial in MPC, solving the corresponding optimization problem in real time remains challenging when combining long horizons with high-fidelity models that capture both short-term dynamics and long-term behavior. Motivated by results on the Exponential Decay of Sensitivities (EDS), which imply that, under certain conditions, the influence of modeling inaccuracies decreases exponentially along the prediction horizon, this paper proposes a multi-timescale MPC scheme for fast-sampled control. Tailored to systems with both fast and slow dynamics, the proposed approach improves computational efficiency by i) switching to a reduced model that captures only the slow, dominant dynamics and ii) exponentially increasing integration step sizes to progressively reduce model detail along the horizon. We evaluate the method on three practically motivated robotic control problems in simulation and observe speed-ups of up to an order of magnitude.

2511.02849 2026-04-23 eess.SP cs.CV eess.IV

Benchmarking ResNet for Short-Term Hypoglycemia Classification with DiaData

Beyza Cinar, Maria Maleshkova

Comments 11 pages, 5 Tables, 4 Figures, BHI 2025 conference (JBHI special issue). References were corrected

详情
英文摘要

Individualized therapy is driven forward by medical data analysis, which provides insight into the patient's context. In particular, for Type 1 Diabetes (T1D), which is an autoimmune disease, relationships between demographics, sensor data, and context can be analyzed. However, outliers, noisy data, and small data volumes cannot provide a reliable analysis. Hence, the research domain requires large volumes of high-quality data. Moreover, missing values can lead to information loss. To address this limitation, this study improves the data quality of DiaData, an integration of 15 separate datasets containing glucose values from 2510 subjects with T1D. Notably, we make the following contributions: 1) Outliers are identified with the interquartile range (IQR) approach and treated by replacing them with missing values. 2) Small gaps ($\le$ 25 min) are imputed with linear interpolation and larger gaps ($\ge$ 30 and $<$ 120 min) with Stineman interpolation. Based on a visual comparison, Stineman interpolation provides more realistic glucose estimates than linear interpolation for larger gaps. 3) After data cleaning, the correlation between glucose and heart rate is analyzed, yielding a moderate relation between 15 and 60 minutes before hypoglycemia ($\le$ 70 mg/dL). 4) Finally, a benchmark for hypoglycemia classification is provided with a state-of-the-art ResNet model. The model is trained with the Maindatabase and Subdatabase II of DiaData to classify hypoglycemia onset up to 2 hours in advance. Training with more data improves performance by 7% while using quality-refined data yields a 2-3% gain compared to raw data.

2510.08465 2026-04-23 stat.ML cs.LG

Accumulated Aggregated D-Optimal Designs for Estimating Main Effects in Black-Box Models

Chih-Yu Chang, Ming-Chung Chang

详情
英文摘要

Estimating how individual input variables affect the output of a black-box model is a central task in explainable machine learning. However, existing methods suffer from two key limitations: sensitivity to out-of-distribution (OOD) evaluations, which arises when query points are placed far from the data manifold, and instability under feature correlation, which can lead to unreliable effect estimates in practice. We introduce a unified view of main effect estimation as a design problem, which reveals that all existing methods differ only in their choice of evaluation locations. Building on this formulation, we propose A2D2E, an Estimator based on Accumulated Aggregated D-Optimal Designs, which replaces evaluations with a D-optimal hypercube design to minimize the variance of main effect estimation. A2D2E is model-agnostic, requires no differentiability of the predictor, and admits a closed-form estimator with complexity comparable to existing approaches. We establish that A2D2E is consistent to the same population target as ALE, and extend this result to the realistic setting where only a surrogate model is available. Through extensive simulations across multiple predictive models and dependence settings, we demonstrate that A2D2E outperforms ALE-based methods, with the largest gains under high feature correlation.

2510.05786 2026-04-23 cs.GT cs.DM cs.LG math.CO

Möbius transforms and Shapley values for vector-valued functions on weighted directed acyclic multigraphs

Patrick Forré, Abel Jansma

Comments 50 pages, 2 figures

详情
英文摘要

Möbius inversion and Shapley values are two mathematical tools for characterizing and decomposing higher-order structure in complex systems. The former defines higher-order interactions as discrete derivatives over a partial order; the latter provides a principled way to attribute those interactions back to the `atomic' elements of the system. Both have found wide application, from combinatorics and cooperative game theory to machine learning and explainable AI. We generalize both tools simultaneously in two orthogonal directions: 1) from real-valued functions to functions valued in any abelian group (in particular, vector-valued functions), and 2) from partial orders and lattices to directed acyclic multigraphs (DAMGs) and weighted versions thereof. The classical axioms, linearity, efficiency, null player, and symmetry, which uniquely characterize Shapley values on lattices, are insufficient in this more general setting. We resolve this by introducing projection operators that recursively re-attribute higher-order synergies down to the roots of the graph, and by proposing two natural axioms: weak elements (coalitions with zero synergy can be removed without affecting any attribution) and flat hierarchy (on graphs with no intermediate hierarchy, attributions are distributed proportionally to edge counts). Together with linearity, these three axioms uniquely determine the Shapley values via a simple explicit formula, while automatically implying efficiency, null player, symmetry, and a novel projection property. The resulting framework recovers all existing lattice-based definitions as special cases, and naturally handles settings, such as games on non-lattice partial orders, which were previously out of reach. The extension to vector-valued functions and general DAMG-structured hierarchies opens new application areas in machine learning, natural language processing, and explainable AI.

2509.19367 2026-04-23 eess.SP cs.LG stat.ML

Low-Cost Sensor Fusion Framework for Organic Substance Classification and Quality Control Using Classification Methods

Borhan Uddin Chowdhury, Damian Valles, Md Raf E Ul Shougat

Comments Copyright 2025 IEEE. This is the author's version of the work accepted for publication in FMLDS 2025. The final version will be published by IEEE and available via DOI (to be inserted when available). Accepted at FMLDS 2025, to appear in IEEE Xplore. 8 pages, 17 figures, 3 tables

详情
Journal ref
2025 IEEE International Conference on Frontiers in Machine Learning and Data Science (FMLDS), IEEE, 2025
英文摘要

We present a sensor-fusion framework for rapid, non-destructive classification and quality control of organic substances, built on a standard Arduino Mega 2560 microcontroller platform equipped with three commercial environmental and gas sensors. All data used in this study were generated in-house: sensor outputs for ten distinct classes - including fresh and expired samples of apple juice, onion, garlic, and ginger, as well as cinnamon and cardamom - were systematically collected and labeled using this hardware setup, resulting in a unique, application-specific dataset. Correlation analysis was employed as part of the preprocessing pipeline for feature selection. After preprocessing and dimensionality reduction (PCA/LDA), multiple supervised learning models - including Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF), each with hyperparameter tuning, as well as an Artificial Neural Network (ANN) and an ensemble voting classifier - were trained and cross-validated on the collected dataset. The best-performing models, including tuned Random Forest, ensemble, and ANN, achieved test accuracies in the 93 to 94 percent range. These results demonstrate that low-cost, multisensory platforms based on the Arduino Mega 2560, combined with advanced machine learning and correlation-driven feature engineering, enable reliable identification and quality control of organic compounds.

2509.16002 2026-04-23 quant-ph cs.LG

Scalable Quantum Reinforcement Learning on NISQ Devices with Dynamic-Circuit Qubit Reuse and Grover Optimization

Thet Htar Su, Shaswot Shresthamali, Masaaki Kondo

详情
英文摘要

A scalable and resource-efficient quantum reinforcement learning framework is presented that eliminates the linear qubit-scaling barrier in multi-step quantum Markov decision processes (QMDPs). The proposed framework integrates a QMDP formulation, dynamic-circuit execution, and Grover-based amplitude amplification into a unified quantum-native architecture. Environment dynamics are encoded entirely within quantum Hilbert space, enabling coherent superposition over state-action sequences and a direct quantum agent-environment interface without intermediate quantum-to-classical conversion. The central contribution is a dynamic execution model for multi-step QMDPs that employs mid-circuit measurement and reset to recycle a fixed physical quantum register across sequential interactions. This approach preserves trajectory fidelity relative to a static unrolled QMDP, generating identical state-action sequences while reducing the physical qubit requirement from 7xT to a constant 7, independent of the interaction horizon T. Thus, the qubit complexity of multi-step QMDPs is transformed from O(T) to O(1) while maintaining functional equivalence at the level of trajectory generation. Trajectory returns are evaluated via quantum arithmetic, and high-return trajectories are marked and amplified using amplitude amplification to increase their sampling probability. Simulations confirm preservation of trajectory fidelity with a 66% qubit reduction compared to a static design. Experimental execution on an IBM Heron-class processor demonstrates feasibility on noisy intermediate-scale quantum hardware, establishing a scalable and resource-efficient foundation for large-scale quantum-native reinforcement learning.

2509.08539 2026-04-23 cs.HC cs.LG

Motion-Based User Identification across XR and Metaverse Applications by Deep Classification and Similarity Learning

Lukas Schach, Christian Rack, Ryan P. McMahan, Marc Erich Latoschik

详情
Journal ref
Frontiers in Virtual Reality, 2026
英文摘要

This paper examines the generalization capacity of two state-of-the-art classification and similarity learning models in reliably identifying users based on their motions in various Extended Reality (XR) applications. We developed a novel dataset containing a wide range of motion data from 49 users in five different XR applications: four XR games with distinct tasks and action patterns, and an additional social XR application with no predefined task sets. The dataset is used to evaluate the performance and, in particular, the generalization capacity of the two models across applications. Our results indicate that while the models can accurately identify individuals within the same application, their ability to identify users across different XR applications remains limited. Overall, our results provide insight into current models generalization capabilities and suitability as biometric methods for user verification and identification. The results also serve as a much-needed risk assessment of hazardous and unwanted user identification in XR and Metaverse applications. Our cross-application XR motion dataset and code are made available to the public to encourage similar research on the generalization of motion-based user identification in typical Metaverse application use cases.

2509.02060 2026-04-23 q-bio.BM cs.LG

Morphology-Aware Peptide Discovery via Masked Conditional Generative Modeling

Nuno Costa, Julija Zavadlav

Comments 46 pages, 4 figures, 6 tables

详情
英文摘要

Peptide self-assembly prediction offers a powerful bottom-up strategy for designing biocompatible, low-toxicity materials for large-scale synthesis in a broad range of biomedical and energy applications. However, screening the vast sequence space for categorization of aggregate morphology remains intractable. We introduce PepMorph, an end-to-end peptide discovery pipeline that generates novel sequences that are not only prone to aggregate but whose self-assembly is steered toward fibrillar or spherical morphologies by conditioning on isolated peptide descriptors that serve as morphology proxies. To this end, we compiled a new dataset by leveraging existing aggregation propensity datasets and extracting geometric and physicochemical descriptors. This dataset is then used to train a Transformer-based Conditional Variational Autoencoder with a masking mechanism, which generates novel peptides under arbitrary conditioning. After filtering to ensure design specifications and validation of generated sequences through coarse-grained molecular dynamics (CG-MD) simulations, PepMorph yielded 83% success rate under our CG-MD validation protocol and morphology criterion for the targeted class, showcasing its promise as a framework for application-driven peptide discovery.

2508.18948 2026-04-23 hep-th cond-mat.dis-nn cs.LG stat.ML

Gauge-covariant stochastic neural fields: Stability and finite-width effects

Rodrigo Carmo Terin

Comments 20 pages, 2 figures, 1 table. Accepted version for publication in Scientific Reports

详情
英文摘要

We develop a gauge-covariant stochastic effective field theory for stability and finite-width effects in deep neural systems. The model uses classical commuting fields: a complex matter field, a real Abelian connection field, and a fictitious stochastic depth variable. Using the Martin--Siggia--Rose--Janssen--de~Dominicis formalism, we derive its functional representation and a two-replica linear-response construction defining the maximal Lyapunov exponent and the amplification factor for the edge of chaos. Finite-width effects appear as perturbative corrections to dressed kernels, and the marginality condition remains unchanged at the order considered for fixed kernel geometry. Numerically, finite-width multilayer perceptrons follow the mean-field instability threshold, and a linear stochastic effective sector reproduces the predicted low-frequency spectral deformation.

2508.15411 2026-04-23 cs.SE cs.CL cs.LG cs.MA

Foundational Design Principles and Patterns for Building Robust and Adaptive GenAI-Native Systems

Frederik Vandeputte

详情
英文摘要

Generative AI (GenAI) has emerged as a transformative technology, demonstrating remarkable capabilities across diverse application domains. However, GenAI faces several major challenges in developing reliable and efficient GenAI-empowered systems due to its unpredictability and inefficiency. This paper advocates for a paradigm shift: future GenAI-native systems should integrate GenAI's cognitive capabilities with traditional software engineering principles to create robust, adaptive, and efficient systems. We introduce foundational GenAI-native design principles centered around five key pillars -- reliability, excellence, evolvability, self-reliance, and assurance -- and propose architectural patterns such as GenAI-native cells, organic substrates, and programmable routers to guide the creation of resilient and self-evolving systems. Additionally, we outline the key ingredients of a GenAI-native software stack and discuss the impact of these systems from technical, user adoption, economic, and legal perspectives, underscoring the need for further validation and experimentation. Our work aims to inspire future research and encourage relevant communities to implement and refine this conceptual framework.

2508.08822 2026-04-23 cs.AR cs.AI cs.ET cs.PF

OISMA: On-the-fly In-memory Stochastic Multiplication Architecture for Matrix-Multiplication Workloads

Shady Agwa, Yihan Pan, Georgios Papandroulidakis, Themis Prodromakis

Comments This work has been accepted for publication by the IEEE Journal on Exploratory Solid-State Computational Devices and Circuits

详情
Journal ref
IEEE Journal on Exploratory Solid-State Computational Devices and Circuits 2026
英文摘要

Artificial intelligence (AI) models are currently driven by a significant upscaling of their complexity, with massive matrix-multiplication workloads representing the major computational bottleneck. In-memory computing (IMC) architectures are proposed to avoid the von Neumann bottleneck. However, both digital/binary-based and analog IMC architectures suffer from various limitations, which significantly degrade the performance and energy efficiency gains. This work proposes OISMA, an energy-efficient IMC architecture that utilizes the computational simplicity of a quasi-stochastic computing (SC) domain (bent-pyramid (BP) system) while keeping the same efficiency, scalability, and productivity of digital memories. OISMA converts normal memory read operations into in situ stochastic multiplication operations with a negligible cost. An accumulation periphery then accumulates the output multiplication bitstreams, achieving the matrix multiplication (MatMul) functionality. A 4-kB 1T1R OISMA array was implemented using a commercial 180-nm technology node and in-house resistive random-access memory (RRAM) technology. At 50 MHz, it achieves 0.789 TOPS/W and 3.98 GOPS/mm2 for energy and area efficiency, respectively, occupying an effective computing area of 0.804241 mm2. Scaling OISMA to 22-nm technology shows a significant improvement of two orders of magnitude in energy efficiency and one order of magnitude in area efficiency, compared to dense MatMul IMC architectures.

2508.07050 2026-04-23 cs.IR cs.AI cs.CL cs.LG

ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability

Wenhan Liu, Xinyu Ma, Weiwei Sun, Yutao Zhu, Yuchen Li, Dawei Yin, Zhicheng Dou

Comments 25 pages, accepted by ACL2026 main conference

详情
英文摘要

Large Language Model (LLM) based listwise ranking has shown superior performance in many passage ranking tasks. With the development of Large Reasoning Models (LRMs), many studies have demonstrated that step-by-step reasoning during test-time helps improve listwise ranking performance. However, due to the scarcity of reasoning-intensive training data, existing rerankers perform poorly in many complex ranking scenarios, and the ranking ability of reasoning-intensive rerankers remains largely underdeveloped. In this paper, we first propose an automated reasoning-intensive training data synthesis framework, which sources training queries and passages from diverse domains and applies DeepSeek-R1 to generate high-quality training labels. To empower the listwise reranker with strong reasoning ability, we further propose a two-stage training approach, which includes a cold-start supervised fine-tuning (SFT) stage and a reinforcement learning (RL) stage. During the RL stage, we design a novel multi-view ranking reward tailored to the multi-turn nature of listwise ranking. Extensive experiments demonstrate that our trained reasoning-intensive reranker \textbf{ReasonRank} outperforms existing baselines significantly and also achieves much lower latency than the pointwise reranker. Our codes are available at https://github.com/8421BCD/ReasonRank.

2507.16433 2026-04-23 stat.ME cs.LG

Adaptive Multi-task Learning for Multi-sector Portfolio Optimization

Qingliang Fan, Ruike Wu, Yanrong Yang

详情
英文摘要

Accurate transfer of information across multiple sectors to enhance model estimation is both significant and challenging in multi-sector portfolio optimization involving a large number of assets in different classes. Within the framework of factor modeling, we propose a novel data-adaptive multi-task learning methodology that quantifies and learns the relatedness among the principal temporal subspaces (spanned by factors) across multiple sectors under study. This approach not only improves the simultaneous estimation of multiple factor models but also enhances multi-sector portfolio optimization, which heavily depends on the accurate recovery of these factor models. Additionally, a novel and easy-to-implement algorithm, termed projection-penalized principal component analysis, is developed to accomplish the multi-task learning procedure. Diverse simulation designs and practical application on daily return data from Russell 3000 index demonstrate the advantages of multi-task learning methodology.

2507.08540 2026-04-23 cs.CR cs.AI

White-Basilisk: A Hybrid Model for Code Vulnerability Detection

Ioannis Lamprou, Alexander Shevtsov, Ioannis Arapakis, Sotiris Ioannidis

详情
英文摘要

The proliferation of software vulnerabilities presents a significant challenge to cybersecurity, necessitating more effective detection methodologies. We introduce White-Basilisk, a novel approach to vulnerability detection that demonstrates superior performance while challenging prevailing assumptions in AI model scaling. Utilizing an innovative architecture that integrates Mamba layers, linear self-attention, and a Mixture of Experts framework, White-Basilisk achieves state-of-the-art results in vulnerability detection tasks with a parameter count of only 200M. The model's capacity to process sequences of unprecedented length enables comprehensive analysis of extensive codebases in a single pass, surpassing the context limitations of current Large Language Models (LLMs). White-Basilisk exhibits robust performance on imbalanced, real-world datasets, while maintaining computational efficiency that facilitates deployment across diverse organizational scales. This research not only establishes new benchmarks in code security but also provides empirical evidence that compact, efficiently designed models can outperform larger counterparts in specialized tasks, potentially redefining optimization strategies in AI development for domain-specific applications.

2507.07800 2026-04-23 q-bio.QM cs.CV

A novel attention mechanism for noise-adaptive and robust segmentation of microtubules in microscopy images

Achraf Ait Laydi, Louis Cueff, Mewen Crespo, Yousef El Mourabit, Hélène Bouvrais

详情
英文摘要

Segmenting cytoskeletal filaments in microscopy images is essential for studying their roles in cellular processes. However, this task is highly challenging due to the fine, densely packed, and intertwined nature of these structures. Imaging limitations further complicate analysis. While deep learning has advanced segmentation of large, well-defined biological structures, its performance often degrades under such adverse conditions. Additional challenges include obtaining precise annotations for curvilinear structures and managing severe class imbalance during training. We introduce a novel noise-adaptive attention mechanism that extends the Squeeze-and-Excitation (SE) module to dynamically adjust to varying noise levels. Integrated into a U-Net decoder with residual encoder blocks, this yields ASE_Res_UNet, a lightweight yet high-performance model. We also developed a synthetic dataset generation strategy that ensures accurate annotations of fine filaments in noisy images. We systematically evaluated loss functions and metrics to mitigate class imbalance, ensuring robust performance assessment. ASE_Res_UNet effectively segmented microtubules in noisy synthetic images, outperforming its ablated variants. It also demonstrated superior segmentation compared to models with alternative attention mechanisms or distinct architectures, while requiring fewer parameters, making it efficient for resource-constrained environments. Evaluation on a newly curated real microscopy dataset and a recently reannotated dataset highlighted ASE_Res_UNet's effectiveness in segmenting microtubules beyond synthetic images. For these datasets, ASE_Res_UNet was competitive with a recent synthetic data-driven approach that shares two cytoskeleton pretrained models. Importantly, ASE_Res_UNet showed strong transferability to other curvilinear structures (blood vessels and nerves) across diverse imaging conditions.

2506.20910 2026-04-23 math.OC cs.LG stat.ML

Faster Fixed-Point Methods for Multichain MDPs

Matthew Zurek, Yudong Chen

详情
Journal ref
NeurIPS 2025
英文摘要

We study value-iteration (VI) algorithms for solving general (a.k.a. multichain) Markov decision processes (MDPs) under the average-reward criterion, a fundamental but theoretically challenging setting. Beyond the difficulties inherent to all average-reward problems posed by the lack of contractivity and non-uniqueness of solutions to the Bellman operator, in the multichain setting an optimal policy must solve the navigation subproblem of steering towards the best connected component, in addition to optimizing long-run performance within each component. We develop algorithms which better solve this navigational subproblem in order to achieve faster convergence for multichain MDPs, obtaining improved rates of convergence and sharper measures of complexity relative to prior work. Many key components of our results are of potential independent interest, including novel connections between average-reward and discounted problems, optimal fixed-point methods for discounted VI which extend to general Banach spaces, new sublinear convergence rates for the discounted value error, and refined suboptimality decompositions for multichain MDPs. Overall our results yield faster convergence rates for discounted and average-reward problems and expand the theoretical foundations of VI approaches.

2506.16658 2026-04-23 math.ST cs.LG stat.ML stat.TH

Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards

Wenlong Ji, Yihan Pan, Ruihao Zhu, Lihua Lei

详情
英文摘要

Multi-armed bandit (MAB) is a widely adopted framework for sequential decision-making under uncertainty. Traditional bandit algorithms rely solely on online data, which tends to be scarce as it must be gathered during the online phase when the arms are actively pulled. However, in many practical settings, rich auxiliary data, such as covariates of past users, is available prior to deploying any arms. We introduce a new setting for MAB where pre-trained machine learning (ML) models are applied to convert side information and historical data into \emph{surrogate rewards}. A prominent challenge of this setting is that the surrogate rewards may exhibit substantial bias, as true reward data is typically unavailable in the offline phase, forcing ML predictions to heavily rely on extrapolation. To address the issue, we propose the Machine Learning-Assisted Upper Confidence Bound (MLA-UCB) algorithm, which can be applied to any reward prediction model and any form of auxiliary data. When the predicted and true rewards are jointly Gaussian, it provably improves the cumulative regret, even in cases where the mean surrogate reward completely misaligns with the true mean rewards, and achieves the asymptotic optimality among a broad class of policies. Notably, our method requires no prior knowledge of the covariance matrix between true and surrogate rewards. We further extend the method to a batched reward MAB problem, where each arm pull yields a batch of observations and rewards may be non-Gaussian, and we derive computable confidence bounds and regret guarantees that improve upon classical UCB algorithms. Finally, extensive simulations with both Gaussian and ML-generated surrogates, together with real-world studies on language model selection and video recommendation, demonstrate consistent and often substantial regret reductions with moderate offline surrogate sample sizes and correlations.

2505.07849 2026-04-23 cs.SE cs.AI cs.IR

SweRank: Software Issue Localization with Code Ranking

Revanth Gangi Reddy, Tarun Suresh, JaeHyeok Doo, Ye Liu, Xuan Phi Nguyen, Yingbo Zhou, Semih Yavuz, Caiming Xiong, Heng Ji, Shafiq Joty

Comments ICLR 2026 Camera Ready Version

详情
英文摘要

Software issue localization, the task of identifying the precise code locations (files, classes, or functions) relevant to a natural language issue description (e.g., bug report, feature request), is a critical yet time-consuming aspect of software development. While recent LLM-based agentic approaches demonstrate promise, they often incur significant latency and cost due to complex multi-step reasoning and relying on closed-source LLMs. Alternatively, traditional code ranking models, typically optimized for query-to-code or code-to-code retrieval, struggle with the verbose and failure-descriptive nature of issue localization queries. To bridge this gap, we introduce SweRank, an efficient and effective retrieve-and-rerank framework for software issue localization. To facilitate training, we construct SweLoc, a large-scale dataset curated from public GitHub repositories, featuring real-world issue descriptions paired with corresponding code modifications. Empirical results on SWE-Bench-Lite and LocBench show that SweRank achieves state-of-the-art performance, outperforming both prior ranking models and costly agent-based systems using closed-source LLMs like Claude-3.5. Further, we demonstrate SweLoc's utility in enhancing various existing retriever and reranker models for issue localization, establishing the dataset as a valuable resource for the community.

2504.19239 2026-04-23 quant-ph cs.LG

The effect of the number of parameters and the number of local feature patches on loss landscapes in distributed quantum neural networks

Yoshiaki Kawase

Comments 15 pages + Appendices

详情
英文摘要

Quantum neural networks hold promise for tackling computationally challenging tasks that are intractable for classical computers. However, their practical application is hindered by significant optimization challenges, arising from complex loss landscapes characterized by barren plateaus and numerous local minima. These problems become more severe as the number of parameters or qubits increases, hampering effective training. To mitigate these optimization challenges, particularly for classical data, we distribute overlapping local patches across multiple quantum neural networks, processing each patch with an independent quantum neural network, and aggregating their outputs for prediction. In this study, we investigate how the number of parameters and patches affects the loss landscape geometry of this distributed quantum neural network architecture via theoretical and empirical Hessian analyses and loss landscape visualization. Our results confirm that increasing the number of parameters tends to lead to deeper and sharper loss landscapes. Crucially, we theoretically derive and empirically demonstrate that increasing the number of patches significantly reduces the largest Hessian eigenvalue at minima. Furthermore, our analysis of the full Hessian eigenspectrum reveals a structure consisting of a bulk of near-zero eigenvalues and distinct outlier spikes corresponding to the number of classes, similar to classical deep learning models. These findings suggest that our distributed patch approach acts as a form of implicit structural regularization, promoting optimization stability and potentially enhancing generalization. Our study provides valuable insights into optimization challenges and highlights that the distributed patch approach is a promising strategy for developing more trainable and scalable quantum machine learning models for classical data tasks.

2503.23729 2026-04-23 math.NA cs.LG cs.NA

Integral regularization PINNs for evolution equations

Xiaodong Feng, Haojiong Shangguan, Tao Tang, Xiaoliang Wan

详情
英文摘要

Evolution equations, including both ordinary differential equations (ODEs) and partial differential equations (PDEs), play a pivotal role in modeling dynamic systems. However, achieving accurate long-time integration for these equations remains a significant challenge. While physics-informed neural networks (PINNs) provide a mesh-free framework for solving PDEs, they often suffer from temporal error accumulation, which limits their effectiveness in capturing long-time behaviors. To alleviate this issue, we propose integral regularization PINNs (IR-PINNs), a novel approach that enhances temporal accuracy by incorporating an integral-based residual term into the loss function. This method divides the entire time interval into smaller sub-intervals and enforces constraints over these sub-intervals, thereby improving the resolution and correlation of temporal dynamics. Furthermore, IR-PINNs leverage adaptive sampling to dynamically refine the distribution of collocation points based on the evolving solution, ensuring higher accuracy in regions with sharp gradients or rapid variations. Numerical experiments on benchmark problems demonstrate that IR-PINNs outperform original PINNs and other state-of-the-art methods in capturing long-time behaviors, offering a robust and accurate solution for evolution equations.

2503.03816 2026-04-23 astro-ph.GA cs.LG

The Optical and Infrared Are Connected

Christian K. Jespersen, Peter Melchior, David N. Spergel, Andy D. Goulding, ChangHoon Hahn, Kartheik G. Iyer

Comments Accepted to ApJ. 18 pages, 14 figures. 11 pages of Appendix

详情
英文摘要

Galaxies are often modelled as composites of separable components with distinct spectral signatures, implying that different wavelength ranges are only weakly correlated. They are not. We present a data-driven model which exploits subtle correlations between physical processes to accurately predict infrared (IR) WISE photometry from a neural summary of optical SDSS spectra. The model achieves accuracies of $χ^2_N \approx 1$ for all photometric bands in WISE, as well as good colors. We are able to tightly constrain typically IR-derived properties, e.g., the bolometric luminosities of AGN and dust parameters such as $\mathrm{q_{PAH}}$. We also test whether current SED-fitting methods reproduce such panchromatic relations, but find their predictions biased and overconfident, likely due to model misspecification, with correlated biases in star-formation rates and AGN luminosities being most evident. To help improve SED models, we determine which features of the optical spectrum are responsible for our improved predictions, and identify several lines (CaII, SrII, FeI, [OII] and H$α$), which point to the complex chronology of star formation and chemical enrichment being incorrectly modelled.

2501.03624 2026-04-23 cs.HC cs.CL

LLAMADRS: Evaluating Open-Source LLMs on Real Clinical Interviews--To Reason or Not to Reason?

Gaoussou Youssouf Kebe, Jeffrey M. Girard, Einat Liebenthal, Justin Baker, Fernando De la Torre, Louis-Philippe Morency

详情
英文摘要

Large language models (LLMs) excel on many NLP benchmarks, but their behavior on real-world, semi-structured prediction remains underexplored. We present LlaMADRS, a benchmark for structured clinical assessment from dialogue built on the CAMI corpus of psychiatric interviews, comprising 5,804 expert annotations across 541 sessions. We evaluate 25 open-source models (standard and reasoning-augmented; 0.6B--400B parameters) and generate over 400,000 predictions. Our results demonstrate that strong open-source LLMs achieve item-level accuracy with residual error below clinically substantial thresholds. Additionally, an Item-then-Sum (ItS) strategy, assessing symptoms individually through discrete LLM calls before synthesizing final scores, significantly reduces error relative to Direct Total Score (DTS) prediction across most model architectures and scales, despite reasoning models attempting similar decomposition in the reasoning traces of their DTS predictions. In fact, we find that performance gains attributed to "reasoning" depend fundamentally on prompt design: standard models equipped with structured task definitions and examples match reasoning-augmented counterparts. Among the latter, longer reasoning traces correlate with reduced error; while higher model scale does across both architectures. Our results clarify when and why reasoning helps and offer actionable guidance for deploying LLMs in semi-structured clinical assessment.

2412.18208 2026-04-23 quant-ph cs.LG

Quantum framework for Reinforcement Learning: Integrating Markov decision process, quantum arithmetic, and trajectory search

Thet Htar Su, Shaswot Shresthamali, Masaaki Kondo

详情
Journal ref
Physical Review A (2025)
英文摘要

This paper introduces a quantum framework for addressing reinforcement learning (RL) tasks, grounded in the quantum principles and leveraging a fully quantum model of the classical Markov decision process (MDP). By employing quantum concepts and a quantum search algorithm, this work presents the implementation and optimization of the agent-environment interactions entirely within the quantum domain, eliminating reliance on classical computations. Key contributions include the quantum-based state transitions, return calculation, and trajectory search mechanism that utilize quantum principles to demonstrate the realization of RL processes through quantum phenomena. The implementation emphasizes the fundamental role of quantum superposition in enhancing computational efficiency for RL tasks. Results demonstrate the capacity of a quantum model to achieve quantum enhancement in RL, highlighting the potential of fully quantum implementations in decision-making tasks. This work not only underscores the applicability of quantum computing in machine learning but also contributes to the field of quantum reinforcement learning (QRL) by offering a robust framework for understanding and exploiting quantum computing in RL systems.

2411.00585 2026-04-23 cs.CY cs.AI

Fairness Testing of Large Language Models in Role-Playing

Xinyue Li, Zhenpeng Chen, Jie M. Zhang, Ying Xiao, Tianlin Li, Weisong Sun, Yang Liu, Yiling Lou, Xuanzhe Liu

Comments Accepted by ACM International Conference on the Foundations of Software Engineering (FSE 2026)

详情
英文摘要

Large Language Models (LLMs) have become foundational in modern language-driven software applications, profoundly influencing daily life. A critical technique in leveraging their potential is role-playing, where LLMs simulate diverse roles to enhance their real-world utility. However, while research has highlighted the presence of social biases in LLM outputs, it remains unclear whether and to what extent these biases emerge during role-playing scenarios. In this paper, we conduct an empirical study on fairness testing of LLMs in role-playing scenarios. To enable this testing, we use LLMs to generate 550 social roles spanning a comprehensive set of 11 demographic attributes, producing 33,000 role-specific questions that target various forms of bias. These questions, covering Yes/No, multiple-choice, and open-ended formats, are designed to prompt LLMs to adopt specific roles and respond accordingly. We employ a combination of rule-based and LLM-based strategies to identify biased responses, rigorously validated through human evaluation. Using the generated questions as the test cases, we conduct extensive evaluations of 10 advanced LLMs. The evaluation reveal 107,580 biased responses across the studied LLMs, with individual models yielding between 7,579 and 16,963 biased responses, underscoring the prevalence of bias in role-playing contexts. To support future research, we have publicly released the dataset, along with all scripts and experimental results.