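arXivDaily daily academic digest: synced with the full arXiv data, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.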

2605.01336 2026-05-05 cs.CL

A Multi-View Media Profiling Suite: Resources, Evaluation, and Analysis

Muhammad Arslan Manzoor, Dilshod Azizov, Daniil Orel, Umer Siddique, Zain Muhammad Mujahid, Yufang Hou, Preslav Nakov

Details

English abstract

News outlets shape public opinion at a scale that makes automated detection of political bias and factuality essential. However, the field still lacks unified resources, comprehensive evaluations across diverse approaches, and systematic analyses of the representations and fusion strategies that matter most, especially under label sparsity and dataset diversity. In addition, there is little empirical work reporting broad, observation-driven findings about what consistently works, what fails, and why. We address these gaps through four main contributions. First, we introduce MBFC-2025, a large-scale label set covering approximately 2,600 outlets from Media Bias/Fact Check (MBFC). Second, we construct multiview representations for ACL-2020 (Panayotov et al., 2022), which includes around 900 outlets, as well as for MBFC-2025. These representations span Alexa graphs, hyperlink graphs, LLM-derived graphs, articles, and Wikipedia descriptions. Third, we provide a systematic evaluation and analysis of embedding views and fusion strategies, including a reinforcement learning-based fusion variant. Fourth, we conduct extensive experiments that achieve state-of-the-art results on ACL-2020 and establish strong benchmarks on MBFC-2025.

2605.01331 2026-05-05 cs.CV

Zero-Shot Interpretable Image Steganalysis for Invertible Image Hiding

Hao Wang, Yiming Yao, Yaguang Xie, Tong Qiao, Zhidong Zhao

Comments Accepted by IEEE SPL

2605.01330 2026-05-05 cs.CV

Colinearity Decay: Training Quantization-Friendly ViTs with Outlier Decay

Jin Tong, Guang Liang, Peilin Sun, Jianxin Wu

Comments 17 pages, 5 figures

2605.01329 2026-05-05 cs.AI cs.CY

Truth or Tribe: How In-group Favoritism Prioritize Facts in Persona Agents

Shijun Lei, Hongyu Wang, Yunji Liang, Haowen Zheng, Bin Guo, Zhiwen Yu

Comments 21 pages. Under review

2605.01325 2026-05-05 cs.CV cs.LG

Rethinking Model Selection in VLM Through the Lens of Gromov-Wasserstein Distance

Muyang Li, Yucheng Liu, Jianbo Ma, Elliot Osborne, Bo Han, Tongliang Liu

Comments Accepted as Highlight publication for CVPR 2026

2605.01322 2026-05-05 cs.CL

Benchmarking LightGBM and BiLSTM for Sentiment Analysis on Indonesian E-Commerce Reviews

Lidia Natasyah Marpaung, Vania Claresta, Iqfina Haula Halika, Luluk Muthoharoh, Ardika Satria, Martin Clinton Tosima Manullang

Comments 9 pages, 3 figures, 2 tables. The paper compares LightGBM, Logistic Regression, and linear SVM against a BiLSTM model for Indonesian e-commerce sentiment analysis using a 15,000-sample dataset from Hugging Face

2605.01317 2026-05-05 cs.CL

Sentiment Analysis of Mobile Legends App Reviews Using Machine Learning and LSTM-Based Deep Learning Models

Vira Putri Maharani, Kharisa Harvanny, Daris Samudra, Luluk Muthoharoh, Ardika Satria, Martin Clinton Tosima Manullang

Comments 8 pages, 3 figures, includes comparative evaluation of Machine Learning and LSTM models for sentiment analysis on Indonesian Mobile Legends app reviews, with dataset description, methodology, model architecture, results, discussion, acknowledgments, and references

2605.01315 2026-05-05 cs.CL

Enhancing Game Review Sentiment Classification on Steam Platform with Attention-Based BiLSTM

Abit Ahmad Oktarian, Fadhil Fitra Wijaya, Dhafin Razaqa Luthfi, Luluk Muthoharoh, Ardika Satria, Martin Clinton Tosima Manullang

Comments 7 pages, 4 figures, and 2 tables. The paper is a research manuscript on sentiment analysis of Steam game reviews, comparing TF-IDF-based machine learning methods with a BiLSTM+Attention deep learning model

2605.01311 2026-05-05 cs.LG econ.EM stat.AP stat.ML

The Partial Testimony of Logs: Evaluation of Language Model Generation under Confounded Model Choice

Jikai Jin, Vasilis Syrgkanis

2605.01310 2026-05-05 cs.LG cs.AI

GraphSculptor: Sculpting Pre-training Coreset for Graph Self-supervised Learning

Chuang Liu, Zelin Yao, Xueqi Ma, Luzhi Wang, Mukun Chen, Pinghua Xu, Wenbin Hu

Comments 9 pages, 5 figures, Accepted by IJCAI 2026

2605.01309 2026-05-05 cs.CV

CUE: Concept-Aware Multi-Label Expansion to Mitigate Concept Confusion in Long-Tailed Learning

Ruichi Zhang, Chikai Shang, Jiacheng Yang, Mengke Li, Yang Zhou, Junlong Gao, Yang Lu

Comments 10 pages. Accepted by CVPR 2026

2605.01302 2026-05-05 cs.CL cs.IR

Beyond Semantic Relevance: Counterfactual Risk Minimization for Robust Retrieval-Augmented Generation

Peiyang Liu, Qiang Yan, Ziqiang Cui, Di Liang, Xi Wang, Wei Ye

2605.01299 2026-05-05 cs.LG

GA-VisAgent: A Multi-Agent application for code generation and visualization in interactive learning

Wang Jian, Zhou Jianbo, Xiong Yuhao, Liu Zhenxia, Luo Wen, Yuan LinWang, Yu ZhaoYuan

2605.01296 2026-05-05 cs.CV

SIFT-VTON: Geometric Correspondence Supervision on Cross-Attention for Virtual Try-On

Kosuke Takemoto, Takafumi Koshinaka

Comments Accepted at ICPR2026

2605.01295 2026-05-05 cs.LG cs.AI

Autonomous Drift Learning in Data Streams: A Unified Perspective

Xiaoyu Yang, En Yu, Jie Lu

Comments Survey Paper, 20 pages

2605.01293 2026-05-05 cs.AI

Lifting Traces to Logic: Programmatic Skill Induction with Neuro-Symbolic Learning for Long-Horizon Agentic Tasks

Jie-Jing Shao, Haiyan Yin, Yueming Lyu, Xingrui Yu, Lan-Zhe Guo, Ivor Tsang, James Kwok, Yu-Feng Li

Comments ICML 2026

2605.01292 2026-05-05 cs.CL

Addressing Data Scarcity in Bangla Fake News Detection: An LLM-Based Dataset Augmentation Approach

Ahmed Alfey Sani, Kazi Akib Zaoad, Shefayat E Shams Adib, Md Abdul Muqtadir, Ajwad Abrar

Comments Accepted in 15th ACM ICSCA, 2026 in Langkawi, Malaysia

2605.01289 2026-05-05 cs.RO

Bi-Level Reinforcement Learning Control for an Underactuated Blimp via Center-of-Mass Reconfiguration

Xiaorui Wang, Hongwu Wang, Yue Fan, Hao Cheng, Feitian Zhang

2605.01283 2026-05-05 cs.CV cs.AI

Developing a Strong Pre-Trained Base Model for Plant Leaf Disease Classification

David J. Richter

Comments Master's thesis

Details

DOI: 10.23173/jnu.000000076557.24010.0012721

English abstract

Plants, crops, and their yields are essential to our very existence, but diseases and pests cause large losses every year. It is therefore vital to spot diseases early and treat them accordingly, stopping the spread while that is still possible. Manual and traditional methods require personnel to walk through the field and check for symptoms 'by hand'. This is laborious and time-consuming, so ML methods have been applied, with promising results. CNN models are especially efficient because they automatically extract features from images, without any manual feature construction, before feeding those features to a classifier. Datasets strongly influence the final performance of a model. Despite their importance to the field, there still seems to be a discrepancy between what is publicly available and what would be required to sufficiently train fully capable models. To overcome these shortcomings, this thesis identifies open datasets for plant leaf disease classification, along with models that can be trained on them, and carries out extensive benchmarks to assess their suitability. A new dataset was then constructed based on those findings and on the findings of an augmentation applicability study. It was used to train a new Base Model based on the DenseNet201 architecture, which outperformed the baseline model on the new dataset and also in domain-specific Transfer-Learning (TL) experiments for plant leaf disease classification on another new dataset. Through TL, the new Base Model yields models that train faster, more robustly, more stably, and with less data than a general model would, overcoming a large number of issues that the field still suffers from.

2605.01278 2026-05-05 cs.AI

Valley3: Scaling Omni Foundation Models for E-commerce

Zeyu Chen, Guanghao Zhou, Qixiang Yin, Ziwang Zhao, Huanjin Yao, Pengjiu Xia, Min Yang, Cen Chen, Minghui Qiu

2605.01277 2026-05-05 cs.CV cs.AI

CNN-based Multi-In-Multi-Out Model for Efficient Spatiotemporal Prediction

Hyeonseok Jin

Comments Master's thesis

Details

DOI: 10.23173/jnu.000000076408.24010.0012633

English abstract

Recently, models based on Convolutional Neural Network (CNN) or Transformer architectures have been proposed to overcome the limitations of Recurrent Neural Network (RNN) based models in spatiotemporal prediction. These models avoid the limited parallelization caused by sequential processing and the error accumulation caused by recursive prediction, and they show high performance. Nevertheless, some challenges remain. First, CNN-based models have difficulty considering global information because of the local nature of the kernel, which limits their performance. In addition, information is mixed because the time axis is combined with the channel axis of the image for processing. Models based on the Transformer architecture have high complexity due to the self-attention calculation and take a long time to train. In this paper, we propose a new model structure called CNN-based Multi-In-Multi-Out model for Efficient Spatiotemporal Prediction (MIMO-ESP) to overcome these limitations. MIMO-ESP considers global information and significantly reduces complexity by configuring a Transformer architecture based on CNN. In addition, it treats the time axis as an independent axis without combining it, and effectively considers spatiotemporal information together by applying dilation. This structure makes MIMO-ESP both efficient and high-performing. Extensive experimental results on three benchmark datasets, covering video, traffic, and precipitation prediction tasks, demonstrate the usefulness of MIMO-ESP, which achieves competitive efficiency while outperforming existing models. Furthermore, the ablation study results demonstrate the usefulness of the components of MIMO-ESP, emphasizing the potential of the proposed approaches.

2605.01272 2026-05-05 cs.CV eess.IV

GameScope: A Multi-Attribute, Multi-Codec Benchmark Dataset for Gaming Video Quality Assessment

Rajesh Sureddi, Shreshth Saini, Avinab Saha, Alan C. Bovik

2605.01266 2026-05-05 cs.CV

Exploring Prompt Alignment with Clinical Factors in Zero-Shot Segmentation VLMs for NSCLC Tumor Segmentation

Suraj Pai, Thibault Heintz, Cosmin Ciausu, Marion Tonneau, Hugo Aerts, Raymond Mak

2605.01257 2026-05-05 cs.AI

Uncertainty-Aware Trip Purpose Inference from GPS Trajectories via POI Semantic Zones and Pareto Calibration

Bo Yang, Haoxuan Ma, Yifan Liu, Zhiyuan Zhang, Chris Stanford, Morgan Sun, Jiaqi Ma

2605.01256 2026-05-05 cs.CL

GIFT: Guided Fine-Tuning and Transfer for Enhancing Instruction-Tuned Language Models

Zhiwen Ruan, Yichao Du, Jianjie Zheng, Longyue Wang, Yun Chen, Peng Li, Jinsong Su, Yang Liu, Guanhua Chen

2605.01255 2026-05-05 cs.LG

Activation Compression in LLMs: Theoretical Analysis and Efficient Algorithm

Wen-Da Wei, Han-Bin Fang, Yang-Di Liu, Jiang-Xin Shi, James Kwok, Yu-Feng Li

2605.01250 2026-05-05 cs.AI

EO-Gym: A Multimodal, Interactive Environment for Earth Observation Agents

Sai Ma, Zhuang Li, Sichao Li, Xinyue Xu, Ruibiao Zhu, Tony Boston, John A. Taylor

2605.01242 2026-05-05 cs.LG

Breaking the Computational Barrier: Provably Efficient Actor-Critic for Low-Rank MDPs

Ruiquan Huang, Donghao Li, Yingbin Liang, Jing Yang

Comments accepted by ICML2026

2605.01236 2026-05-05 cs.CV

Degradation-Aware Adaptive Context Gating for Unified Image Restoration

Lei He, Jielei Chu, Fengmao Lv, Weide Liu, Tianrui Li, Jun Cheng, Yuming Fang

2605.01234 2026-05-05 cs.CV

TT4D: A Pipeline and Dataset for Table Tennis 4D Reconstruction From Monocular Videos

Nima Rahmanian, Daniel Kienzle, Thomas Gossard, Dvij Kalaria, Rainer Lienhart, Shankar Sastry