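arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.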

2604.20135 2026-04-23 cs.CL cs.IR

AFMRL: Attribute-Enhanced Fine-Grained Multi-Modal Representation Learning in E-commerce

Biao Zhang, Lixin Chen, Bin Zhang, Zongwei Wang, Tong Liu, Bo Zheng

Comments: Accepted by ACL 2026

Abstract

Multimodal representation is crucial for E-commerce tasks such as identical product retrieval. Large representation models (e.g., VLM2Vec) demonstrate strong multimodal understanding capabilities, yet they struggle with fine-grained semantic comprehension, which is essential for distinguishing highly similar items. To address this, we propose Attribute-Enhanced Fine-Grained Multi-Modal Representation Learning (AFMRL), which defines product fine-grained understanding as an attribute generation task. It leverages the generative power of Multimodal Large Language Models (MLLMs) to extract key attributes from product images and text, and enhances representation learning through a two-stage training framework: 1) Attribute-Guided Contrastive Learning (AGCL), where the key attributes generated by the MLLM are used in the image-text contrastive learning training process to identify hard samples and filter out noisy false negatives; and 2) Retrieval-aware Attribute Reinforcement (RAR), where the improved retrieval performance of the representation model post-attribute integration serves as a reward signal to enhance the MLLM's attribute generation during multimodal fine-tuning. Extensive experiments on large-scale E-commerce datasets demonstrate that our method achieves state-of-the-art performance on multiple downstream retrieval tasks, validating the effectiveness of harnessing generative models to advance fine-grained representation learning.
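
The abstract does not spell out how generated attributes are used to filter false negatives, so the following is only a minimal sketch of one plausible reading, not the authors' AGCL implementation. It assumes an image-to-text InfoNCE loss, Jaccard overlap between attribute sets as the similarity signal, and a threshold `fn_threshold` above which an off-diagonal pair is treated as a false negative. The function names, the threshold, and the toy attributes are all hypothetical.

```python
import numpy as np


def jaccard(a, b):
    """Jaccard overlap between two attribute collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def attribute_masked_info_nce(img, txt, attrs, temperature=0.07, fn_threshold=1.0):
    """Image-to-text InfoNCE loss that drops suspected false negatives.

    img, txt: (n, d) arrays of L2-normalised embeddings; row i of each is a matched pair.
    attrs: list of n attribute collections (stand-ins for MLLM-generated attributes).
    fn_threshold: off-diagonal pairs whose attribute overlap is >= this value are
        assumed to be false negatives and removed from the softmax denominator.
    Returns the mean loss and the boolean false-negative mask.
    """
    n = img.shape[0]
    logits = img @ txt.T / temperature
    overlap = np.array(
        [[jaccard(attrs[i], attrs[j]) for j in range(n)] for i in range(n)]
    )
    # Never mask the diagonal: those are the true positives.
    false_neg = (overlap >= fn_threshold) & ~np.eye(n, dtype=bool)
    masked = np.where(false_neg, -np.inf, logits)
    # Numerically stable log-sum-exp over each row of the masked logits.
    row_max = masked.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(masked - row_max).sum(axis=1))
    loss = float(np.mean(log_denominator - np.diag(logits)))
    return loss, false_neg


# Toy usage: items 0 and 1 share identical attributes, so each is masked
# out of the other's negatives; item 2 remains an ordinary negative.
rng = np.random.default_rng(0)
emb = rng.normal(size=(3, 4))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
attrs = [{"red", "cotton"}, {"red", "cotton"}, {"blue", "wool"}]
loss, mask = attribute_masked_info_nce(emb, emb, attrs)
```

Lowering `fn_threshold` below 1.0 would mask near-duplicates as well. Identifying hard samples, which the abstract also mentions, could reuse the same overlap matrix, but that step is not sketched here.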

2604.20131 2026-04-23 cs.CL

Whose Story Gets Told? Positionality and Bias in LLM Summaries of Life Narratives

Melanie Subbiah, Haaris Mian, Nicholas Deas, Ananya Mayukha, Dan P. McAdams, Kathleen McKeown

2604.20130 2026-04-23 cs.LG cs.CV

Pairing Regularization for Mitigating Many-to-One Collapse in GANs

Kuan-Yu Lin, Yu-Chih Huang, Tie Liu

2604.20129 2026-04-23 cs.LG cs.DC cs.PF cs.SE

A Delta-Aware Orchestration Framework for Scalable Multi-Agent Edge Computing

Samaresh Kumar Singh, Joyjit Roy

Comments: 11 pages, 2 figures, 10 tables

Abstract

The Synergistic Collapse occurs when scaling beyond 100 agents causes superlinear performance degradation that individual optimizations cannot prevent. We observe this collapse with 150 cameras in a Smart City deployment using MADDPG, where Deadline Satisfaction drops from 78% to 34%, producing approximately $180,000 in annual cost overruns. Prior work has addressed each contributing factor in isolation: exponential action-space growth, computational redundancy among spatially adjacent agents, and task-agnostic hardware scheduling. None has examined how these three factors interact and amplify each other. We present DAOEF (Delta-Aware Orchestration for Edge Federations), a framework that addresses all three simultaneously through: (1) Differential Neural Caching, which stores intermediate layer activations and computes only the input deltas, achieving 2.1x higher hit ratios (72% vs. 35%) than output-level caching while staying within 2% accuracy loss through empirically calibrated similarity thresholds; (2) Criticality-Based Action Space Pruning, which organizes agents into priority tiers and reduces coordination complexity from O(n^2) to O(n log n) with less than 6% optimality loss; and (3) Learned Hardware Affinity Matching, which assigns tasks to their optimal accelerator (GPU, CPU, NPU, or FPGA) to prevent compounding mismatch penalties. Controlled factor-isolation experiments confirm that each mechanism is necessary but insufficient on its own: removing any single mechanism increases latency by more than 40%, validating that the gains are interdependent rather than additive. Across four datasets (100-250 agents) and a 20-device physical testbed, DAOEF achieves a 1.45x multiplicative gain over applying the three mechanisms independently. A 200-agent cloud deployment yields a 62% latency reduction (280 ms vs. 735 ms), with sub-linear latency growth up to 250 agents.
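
As a quick sanity check on the figures quoted in this abstract, the short sketch below recomputes the two ratios from the stated raw numbers and compares quadratic with n log n coordination cost at the agent counts mentioned. The cost functions ignore constant factors and use a base-2 logarithm purely for illustration; both choices are assumptions, since the abstract gives only the asymptotic classes.

```python
import math

# Ratios recomputed from the numbers stated in the abstract.
hit_ratio_gain = 0.72 / 0.35        # about 2.06, consistent with the reported 2.1x
latency_reduction = 1 - 280 / 735   # about 0.619, consistent with the reported 62%


def pairwise_cost(n):
    """Illustrative all-pairs coordination cost, proportional to n^2."""
    return n * n


def tiered_cost(n):
    """Illustrative tiered coordination cost, proportional to n log n (base 2 assumed)."""
    return n * math.log2(n)


print(f"hit ratio gain: {hit_ratio_gain:.2f}x")
print(f"latency reduction: {latency_reduction:.1%}")
for n in (100, 150, 250):
    ratio = pairwise_cost(n) / tiered_cost(n)
    print(f"n={n}: n^2 / (n log2 n) = {ratio:.1f}")
```

The growing ratio shows why the asymptotic gap matters more as the agent count rises. It says nothing about absolute latency, which depends on constants the abstract does not report.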

2604.20128 2026-04-23 cs.CV

Semi-Supervised Flow Matching for Mosaiced and Panchromatic Fusion Imaging

Peiming Luo, Nan Wang, Litong Liu, Jiahan Huang, Chenxu Wu, Renwei Dian, Junming Hou

2604.20122 2026-04-23 cs.LG cs.AI

Adaptive Conformal Anomaly Detection with Time Series Foundation Models for Signal Monitoring

Natalia Martinez Gil, Fearghal O'Donncha, Wesley M. Gifford, Nianjun Zhou, Dhaval C. Patel, Roman Vaculin

Comments: Code at https://github.com/ibm-granite/granite-tsfm/tree/main/notebooks/hfdemo/adaptive_conformal_tsad

2604.20117 2026-04-23 cs.CL

To Know is to Construct: Schema-Constrained Generation for Agent Memory

Lei Zheng, Weinan Song, Daili Li, Yanming Yang

2604.20116 2026-04-23 cs.SD

Before the Mic: Physical-Layer Voiceprint Anonymization with Acoustic Metamaterials

Zhiyuan Ning, Zhanyong Tang, Xiaojiang Chen, Zheng Wang

2604.20115 2026-04-23 cs.LG cs.AI stat.ML

On the Stability and Generalization of First-order Bilevel Minimax Optimization

Xuelin Zhang, Peipei Yuan

2604.20111 2026-04-23 cs.LG cs.AI stat.ML

Meta Additive Model: Interpretable Sparse Learning With Auto Weighting

Xuelin Zhang, Xinyue Liu, Lingjuan Wu, Hong Chen

2604.20109 2026-04-23 cs.LG cs.AI math.OC

Learning to Solve the Quadratic Assignment Problem with Warm-Started MCMC Finetuning

Yicheng Pan, Ruisong Zhou, Haijun Zou, Tianyou Li, Zaiwen Wen

2604.20098 2026-04-23 cs.LG

Differentiable Conformal Training for LLM Reasoning Factuality

Nathan Hittesdorf, Marco Salzetta, Lu Cheng

Comments: Submitted to ICML

2604.20093 2026-04-23 cs.CV

FurnSet: Exploiting Repeats for 3D Scene Reconstruction

Paul Dobre, Xin Wang, Hongzhou Yang

2604.20090 2026-04-23 cs.CL

Less Languages, Less Tokens: An Efficient Unified Logic Cross-lingual Chain-of-Thought Reasoning Framework

Chenyuan Zhang, Qiguang Chen, Xie Chen, Zhuotao Tian, Bowen Xing, Meishan Zhang, Libo Qin, Baotian Hu, Min Zhang

Comments: Accepted by ACL 2026 Main

2604.20087 2026-04-23 cs.CL cs.LG

SkillLearnBench: Benchmarking Continual Learning Methods for Agent Skill Generation on Real-World Tasks

Shanshan Zhong, Yi Lu, Jingjie Ning, Yibing Wan, Lihan Feng, Yuyi Ao, Leonardo F. R. Ribeiro, Markus Dreyer, Sean Ammirati, Chenyan Xiong

2604.20083 2026-04-23 cs.LG cs.CV

Energy-Based Open-Set Active Learning for Object Classification

Zongyao Lyu, William J. Beksi

Comments: To be published in the 2026 International Conference on Pattern Recognition (ICPR)

2604.20082 2026-04-23 cs.LG

Concept Graph Convolutions: Message Passing in the Concept Space

Lucie Charlotte Magister, Pietro Lio

2604.20079 2026-04-23 cs.LG cs.CL

On the Quantization Robustness of Diffusion Language Models in Coding Benchmarks

Aarav Gupta, Gururaj Deshpande, Chandreyi Chakraborty

2604.20078 2026-04-23 cs.LG

Improved large-scale graph learning through ridge spectral sparsification

Daniele Calandriello, Ioannis Koutis, Alessandro Lazaric, Michal Valko

Comments: International Conference on Machine Learning (ICML 2018)

2604.20077 2026-04-23 cs.LG

Analysis of Nystrom method with sequential ridge leverage scores

Daniele Calandriello, Alessandro Lazaric, Michal Valko

Comments: Uncertainty in Artificial Intelligence (UAI 2016)

2604.20074 2026-04-23 cs.LG

Maximum Entropy Semi-Supervised Inverse Reinforcement Learning

Julien Audiffren, Michal Valko, Alessandro Lazaric, Mohammad Ghavamzadeh

Comments: In Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI 2015)

2604.20062 2026-04-23 cs.LG cs.CR cs.DC

Federated Learning over Blockchain-Enabled Cloud Infrastructure

Saloni Garg, Amit Sagtani, Kamal Kant Hiran

Comments: 7 pages, 5 figures, 2 tables

DOI: 10.1109/ICTBIG68706.2025.11323669
Journal ref: in 2025 IEEE 5th International Conference on ICT in Business Industry & Government (ICTBIG), Indore, India, Dec. 2025, pp. 1-7

Abstract

The rise of IoT devices and the uptake of cloud computing have ushered in a new era of data-driven intelligence. Traditional centralized machine learning models that require a large volume of data to be stored in a single location have therefore become more susceptible to data breaches, privacy violations, and regulatory non-compliance. This report presents a thorough examination of the merging of Federated Learning (FL) and blockchain technology in a cloud-edge setting, demonstrating it as an effective solution to the stated concerns. We propose a detailed four-dimensional architectural categorization that meticulously assesses coordination frameworks, consensus algorithms, data storage practices, and trust models that are significant to these integrated systems. The manuscript presents a comprehensive comparative examination of two cutting-edge frameworks: the Multi-Objectives Reinforcement Federated Learning Blockchain (MORFLB), which is designed for intelligent transportation systems, and the Federated Blockchain-IoT Framework for Sustainable Healthcare Systems (FBCI-SHS), elucidating their distinctive contributions and inherent limitations. Lastly, we engage in a thorough evaluation of the literature that integrates a comparative perspective on current frameworks to discern the singular nature of this research within existing knowledge systems. The manuscript culminates in delineating the principal challenges and offering a strategic framework for prospective research trajectories, emphasizing the advancement of adaptive, resilient, and standardized BCFL systems across diverse application domains.

2604.20055 2026-04-23 cs.AI cs.HC

From Fuzzy to Formal: Scaling Hospital Quality Improvement with AI

Patrick Vossler, Jean Feng, Venkat Sivaraman, Robert Gallo, Hemal Kanzaria, Dana Freiser, Christopher Ross, Amy Ou, James Marks, Susan Ehrlich, Christopher Peabody, Lucas Zier

Comments: 34 pages, 8 figures, 6 tables

Abstract

Hospital Quality Improvement (QI) plays a critical role in optimizing healthcare delivery by translating high-level hospital goals into actionable solutions. A critical step of QI is to identify the key modifiable contributing factors, a process we call QI factor discovery, typically through expert-driven semi-structured qualitative tools like fishbone diagrams, chart reviews, and Lean Healthcare methods. AI has the potential to transform and accelerate QI factor discovery, which is traditionally time- and resource-intensive and limited in reproducibility and auditability. Nevertheless, current AI alignment methods assume the task is well-defined, whereas QI factor discovery is an exploratory, fuzzy, and iterative sense-making process that relies on complex implicit expert judgments. To design an AI pipeline that formalizes the QI process while preserving its exploratory components, we propose viewing the task as learning not only LLM prompts but also the overarching natural-language specifications. In particular, we map QI factor discovery to steps of the classical AI/ML development process (problem formalization, model learning, and model validation) where the specifications are tunable hyperparameters. Domain experts and AI agents iteratively refine both the overarching specifications and AI pipeline until AI extractions are concordant with expert annotations and aligned with clinical objectives. We applied this "Human-AI Spec-Solution Co-optimization" framework at an urban safety-net hospital to identify factors driving prolonged length of stay and unplanned 30-day readmissions. The resulting AI-for-QI pipelines achieved $\ge 70\%$ concordance with expert annotations. Compared to prior manual Lean analyses, the AI pipeline was substantially more efficient, recovered previous findings, surfaced new modifiable factors, and produced auditable reasoning traces.
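
The abstract reports concordance with expert annotations but does not define the metric. The sketch below assumes the simplest reading, the fraction of cases where the AI extraction matches the expert label, and uses made-up labels purely for illustration; the function name and the data are hypothetical.

```python
def concordance(ai_labels, expert_labels):
    """Fraction of positions where the AI label equals the expert label.

    Assumes one label per case and simple exact-match agreement; the paper's
    actual concordance definition may differ.
    """
    if len(ai_labels) != len(expert_labels):
        raise ValueError("label lists must have the same length")
    if not ai_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(a == e for a, e in zip(ai_labels, expert_labels))
    return matches / len(ai_labels)


# Hypothetical factor labels for ten chart reviews (not data from the paper).
expert = ["placement", "placement", "imaging", "consult", "imaging",
          "placement", "consult", "imaging", "placement", "consult"]
ai = ["placement", "placement", "imaging", "consult", "consult",
      "placement", "consult", "imaging", "imaging", "consult"]
score = concordance(ai, expert)
meets_target = score >= 0.70
```

Under this assumed definition, the iterative refinement described in the abstract would repeat until `meets_target` holds and the experts judge the extractions clinically aligned.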

2604.20047 2026-04-23 cs.CV cs.CR

PASTA: A Patch-Agnostic Twofold-Stealthy Backdoor Attack on Vision Transformers

Dazhuang Liu, Yanqi Qiao, Rui Wang, Kaitai Liang, Georgios Smaragdakis

2604.20043 2026-04-23 cs.CL cs.AI

TriEx: A Game-based Tri-View Framework for Explaining Internal Reasoning in Multi-Agent LLMs

Ziyi Wang, Chen Zhang, Wenjun Peng, Qi Wu, Xinyu Wang

Comments: ACL 2026 Main

2604.20041 2026-04-23 cs.CV cs.AI

Normalizing Flows with Iterative Denoising

Tianrong Chen, Jiatao Gu, David Berthelot, Joshua Susskind, Shuangfei Zhai

2604.20039 2026-04-23 cs.AI cs.LG

Separable Pathways for Causal Reasoning: How Architectural Scaffolding Enables Hypothesis-Space Restructuring in LLM Agents

John Alderete, Sebastian Benthal, Connie Xu, John Xing

Comments: 24 pages, 11 tables, 2 figures

2604.20038 2026-04-23 cs.CV

FluSplat: Sparse-View 3D Editing without Test-Time Optimization

Haitao Huang, Shin-Fang Chng, Huangying Zhan, Qingan Yan, Yi Xu

2604.20030 2026-04-23 cs.CV

Learning to count small and clustered objects with application to bacterial colonies

Minghua Zheng, Na Helian, Peter C. R. Lane, Yi Sun, Allen Donald

Comments: 59 pages, 26 figures

2604.20027 2026-04-23 cs.CV cs.AI

Cognitive Alignment At No Cost: Inducing Human Attention Biases For Interpretable Vision Transformers

Ethan Knights