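arXivDaily: a daily academic digest that syncs the full arXiv data feed with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.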

2606.17423 2026-06-17 q-fin.CP stat.ML New submission

Martingale Doppelgänger-Eval: An Identification Framework for Auditing Candlestick Understanding in Vision-Language Models

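Ziyao Wang

AI summary: Proposes the Martingale Doppelgänger-Eval benchmark, which uses controlled experiments to identify whether VLMs base their judgments on candlestick evidence rather than trend extrapolation, and finds that models ignore injected candlestick semantics or respond in the direction opposite to the one the rule implies.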

Abstract

We introduce Martingale Doppelgänger-Eval, a public shadow-market benchmark for auditing whether vision-language models (VLMs) use candlestick evidence rather than extrapolate past trends. The central difficulty is identification: on real market histories, chart evidence and trend are strongly coupled, so an observational score cannot determine whether a fluent technical-analysis narrative is grounded in local visual evidence. We prove this limitation formally: no evaluation functional computed from observational chart--label data can distinguish a grounded responder from a trend-shortcut responder under strong coupling, whereas matched evidence interventions separate the same responders at an exponential rate and trend--label swaps provide an independent shortcut stress test. The benchmark therefore evaluates frozen VLMs on rendered OHLCV charts under four controlled mechanisms: a martingale-null market, injected-alpha counterfactual pairs, trend-confounder swaps, and regime shifts. A structural behavioral model identifies null-market bias, trend sensitivity, evidence sensitivity, prompt/renderer fragility, and evidence faithfulness; the accompanying statistical toolkit provides minimum detectable effects, block-aware sequential testing for metered APIs, and an overlap-weighted artifact check. Across frozen commercial and open VLMs, the identified regression assigns large positive coefficients to past trend but evidence coefficients that are zero or opposite to the rule-implied sign. Matched-pair analyses show that models either ignore injected candlestick semantics or move opposite to the rule-implied direction conditional on responding. The benchmark isolates a failure mode that standard observational chart benchmarks cannot detect and gives a reusable audit template for time-series imagery with controllable label mechanisms.

2606.17383 2026-06-17 q-fin.RM cs.AI cs.LG stat.ML New submission

Model Validation of Agentic AI Systems: A POMDP-Based Framework for Belief-State, Forecast, and Policy Validation

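Matthew Francis Dixon

Affiliations: Quiota LLC

AI summary: Proposes a model validation framework for agentic AI based on partially observable Markov decision processes (POMDPs) that decomposes autonomous decision making into information, belief, forecast, action, and utility components validated independently, and demonstrates it on a portfolio-management case study.

Comments 28 pages, 3 figures, 6 tables. Source code available at https://github.com/mfrdixon/agentic-AI-as-POMDP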

Abstract

Agentic artificial intelligence systems introduce a new class of model risk. Unlike traditional predictive models, autonomous agents continuously acquire information, form beliefs regarding latent states of the environment, generate forecasts, select actions, and adapt their behavior over time. Existing validation methodologies focus primarily on predictive accuracy and therefore provide limited insight into the quality of the underlying decision process. This paper proposes a model validation framework for agentic AI based on Partially Observable Markov Decision Processes (POMDPs). The framework decomposes autonomous decision making into information, beliefs, forecasts, actions, and utility, allowing each component to be validated independently. Large language models (LLMs) are formalized as approximate Bayesian filtering operators, and a model-risk taxonomy is developed encompassing state-space, filtering, forecast, policy, utility-specification, and parameter risks. The model risk validation methodology is demonstrated through a portfolio-management case study in which an agent infers latent market regimes from market and macroeconomic information, generates belief-conditioned forecasts, and constructs portfolios using a Black--Litterman framework. Empirical validation combines performance analysis, belief calibration diagnostics, coverage tests, ablation studies, and parameter-sensitivity analysis. The results indicate that latent-state inference contributes independently to decision quality and that the principal conclusions remain robust across a broad range of parameter values. The principal contribution of the paper is a practical framework for extending established model risk management concepts to autonomous AI systems and providing a rigorous foundation for their validation, governance, and monitoring.

2606.17065 2026-06-17 q-fin.CP cs.AI cs.LG New submission

PIVOT: Bridging Black-Scholes Implied-Volatility and Price Objectives via Differentiable Jäckel Operator

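Raeid Saqur, Yannick Limmer, Anastasis Kratsios, Blanka Horvath, Hans Buehler

Affiliations: Mathematical Institute, University of Oxford; McMaster University; Vector Institute for AI; DRW

AI summary: Proposes the PIVOT layer, which keeps the forward-pass accuracy of the Jäckel solver, supplies gradients by implicit differentiation, and uses a gating mechanism to handle the singularity in the low-vega regime, giving an efficient differentiable translation between price space and implied-volatility space.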

Comments 30 pages, 17 figures, 12 tables

Abstract

Modern option-learning systems operate in two coordinates: price space, where markets quote and no-arbitrage constraints are most naturally enforced, and implied volatility (IV) space, where volatility surfaces are smoothed, regularized, and evaluated. The bottleneck is interface, not approximation: Jäckel's seminal "Let's Be Rational" (LBR) solver already inverts the Black-Scholes price to machine precision efficiently. What is missing is a differentiable layer that preserves LBR in the forward pass and avoids backpropagating through its branch logic. Such a layer must also confront the unavoidable singularity of the inverse map in the low-vega regime, where the sensitivity 1/vega diverges as vega -> 0. We close this gap with PIVOT, the Price-Implied-Volatility Objective Translator. PIVOT keeps the LBR forward pass intact and supplies the backward pass by implicit differentiation through the smooth Black-Scholes/Black-76 price map, with an explicit gating contract: invalid domains return NaN, well-conditioned rows receive the exact 1/vega gradient, and low-vega rows are attenuated rather than silently regularized. On a single H100, a fused Triton kernel reaches 1.79e9 IV/s at machine precision (9.3e-14 max relative error vs. the reference C solver); end-to-end label generation sustains 48.9M/s on synthetic chains and 16.6M/s on SPX OptionMetrics. In a HyperIV-style one-day reproduction on SPX, PIVOT-augmented objectives Pareto-dominate the baselines, reducing held-out price MAE by up to 43.4% and the strongest three-seed gated objective improving price MAE by 38.8% and IV MAE by 21.3% jointly; cross-asset results on RUT, VIX, and NDX show directional price-MAE gains of 40.1%, 24.2%, and 16.7%, while an ungated IV-roundtrip control collapses to a degenerate near-zero surface, confirming the gate as a correctness contract rather than a tuning knob.
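
The implicit-differentiation idea in the abstract can be illustrated with a minimal Python sketch. Since the price map P(sigma) is smooth, the inverse map has derivative d sigma / d P = 1 / vega, so the backward pass never has to differentiate through the solver. Everything below is an assumption for illustration only: the bisection routine is a simple stand-in for the forward solver (it is not "Let's Be Rational"), and the names `iv_and_grad` and `vega_floor` and the quadratic attenuation rule are hypothetical, not the paper's gate.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def black76_call(F, K, T, sigma, df=1.0):
    """Black-76 call price on a forward F with discount factor df."""
    s = sigma * math.sqrt(T)
    d1 = (math.log(F / K) + 0.5 * s * s) / s
    return df * (F * norm_cdf(d1) - K * norm_cdf(d1 - s))

def black76_vega(F, K, T, sigma, df=1.0):
    """Derivative of the Black-76 price with respect to sigma."""
    s = sigma * math.sqrt(T)
    d1 = (math.log(F / K) + 0.5 * s * s) / s
    return df * F * norm_pdf(d1) * math.sqrt(T)

def implied_vol(price, F, K, T, df=1.0, lo=1e-6, hi=5.0, iters=200):
    """Bisection stand-in for the forward solver (NOT Let's Be Rational)."""
    intrinsic = df * max(F - K, 0.0)
    if not (intrinsic < price < df * F):
        return float("nan")  # invalid domain: no implied volatility exists
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if black76_call(F, K, T, mid, df) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def iv_and_grad(price, F, K, T, df=1.0, vega_floor=1e-8):
    """Return (sigma, d sigma / d price) without differentiating the solver."""
    sigma = implied_vol(price, F, K, T, df)
    if math.isnan(sigma):
        return sigma, float("nan")
    vega = black76_vega(F, K, T, sigma, df)
    if vega >= vega_floor:
        return sigma, 1.0 / vega  # well-conditioned: exact implicit gradient
    # Hypothetical gate: scale 1/vega by (vega / vega_floor)**2 so the
    # gradient decays to zero instead of diverging as vega -> 0.
    return sigma, vega / (vega_floor * vega_floor)

price = black76_call(100.0, 100.0, 1.0, 0.2)
sigma, grad = iv_and_grad(price, 100.0, 100.0, 1.0)
print(sigma, grad)
```

The gate here is continuous at `vega_floor`, which is one possible design choice; the abstract only states that low-vega rows are attenuated and invalid domains return NaN.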

2606.17545 2026-06-17 cs.LG q-fin.CP q-fin.PR New submission

Continuous-time Optimal Stopping through Deep Reinforcement Learning

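Cosmin Borsa, Michael Ludkovski

Affiliations: Department of Statistics & Applied Probability, UC Santa Barbara

AI summary: Proposes the CARLOS algorithm, which uses an aggregate deep neural network to learn stopping rules at arbitrarily fine time resolution through progressive time-grid refinement and adaptive sampling, with prices approaching the American upper bound.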

Comments 33 pages

Abstract

Simulation based solvers for optimal stopping problems must discretize the stopping decision. Under classical dynamic programming, a coarse exercise grid with only a few stopping opportunities can materially undervalue the optimal expected reward, whereas on a very fine grid, approximation errors accumulate through the backward recursion. To remove this limitation, we develop a new reinforcement-learning inspired algorithm that enables us to learn the exercise rule at arbitrarily fine time resolution. Our CARLOS (Continuous-time Adaptive Reinforcement Learning for Optimal Stopping) algorithm utilizes an aggregate deep neural network (ADNN) to learn a joint space-time decision boundary. Starting from a coarse time grid, we progressively increase the frequency of stopping opportunities, while in parallel training the ADNN to refine its timing-value estimates. We moreover design an adaptive sampling strategy that gradually concentrates training effort near the stopping boundary. Benchmarked results show that CARLOS delivers higher prices than existing Bermudan solvers, approaching the American upper bound, and achieves high computational efficiency relative to non-RL comparators.
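
The abstract's remark that a coarse exercise grid undervalues the optimal expected reward can be seen in a small classical example. The sketch below is not CARLOS; it is a plain Cox-Ross-Rubinstein binomial tree for a put, where the hypothetical parameter `exercise_every` controls how often early exercise is allowed. Because the exercise grids used in the example are nested, the value can only go up as the grid is refined.

```python
import math

def bermudan_put_crr(S0, K, r, sigma, T, n_steps, exercise_every):
    """Put value on a CRR tree with exercise allowed every `exercise_every` steps."""
    dt = T / n_steps
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    disc = math.exp(-r * dt)
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    # Payoff at maturity for every terminal node (j = number of up moves).
    values = [max(K - S0 * u**j * d**(n_steps - j), 0.0)
              for j in range(n_steps + 1)]
    for step in range(n_steps - 1, -1, -1):
        # Continuation value by backward induction.
        values = [disc * (p * values[j + 1] + (1.0 - p) * values[j])
                  for j in range(step + 1)]
        if step % exercise_every == 0:
            # Stopping opportunity: take the better of exercising and continuing.
            values = [max(v, K - S0 * u**j * d**(step - j))
                      for j, v in enumerate(values)]
    return values[0]

for every in (100, 10, 1):
    price = bermudan_put_crr(100.0, 100.0, 0.05, 0.2, 1.0, 100, every)
    print(f"exercise every {every:3d} steps: {price:.4f}")
```

The tree avoids the error accumulation that simulation-based backward recursion suffers on fine grids, so it only illustrates the first half of the trade-off described above.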

2606.18199 2026-06-17 math.ST q-fin.RM stat.ME stat.ML New submission

Conformal Prediction Intervals with Tail-Specific Guarantees

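Simone Cuonzo, Nina Deliu

AI summary: Extends the classical conformal framework by constructing one-sided conformal intervals and intersecting them into a two-sided interval, giving explicitly calibrated coverage guarantees for the upper and lower tails separately; theory establishes tail-specific and global marginal coverage, and directional calibration improves on skewed data.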

Abstract

This paper extends classical conformal frameworks for constructing prediction intervals with global marginal coverage $1-\alpha$ to intervals that provide explicitly calibrated guarantees for the upper and lower tails separately. Focusing on split conformal prediction, we first construct lower and upper one-sided conformal intervals that achieve marginal validity, and then derive the induced two-sided interval by intersection. Theoretical results prove both tail-specific and global marginal coverage of the induced two-sided interval. Results are presented first for the exchangeable setting, where coverage has finite-sample guarantees, and then for non-exchangeable data, where guarantees are asymptotic. Simulation studies show that the proposed approach achieves improved directional calibration relative to classical two-sided intervals, especially relevant in skewed data. Finally, the benefit of the proposed framework is showcased in a financial application, where one aims for return maximization while seeking strict control on the left tail.
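
The construction described in the abstract (two one-sided split-conformal bounds, then their intersection) can be sketched in a few lines of Python. This is a generic textbook-style sketch under stated assumptions, not the authors' exact procedure: the signed-residual scores, the function names, and the separate miscoverage levels `alpha_lo` and `alpha_hi` are choices made here for illustration. Under exchangeability each tail is missed with probability at most its own level, so the intersection covers with probability at least 1 - alpha_lo - alpha_hi by the union bound.

```python
import math
import random

def conformal_quantile(scores, alpha):
    """Finite-sample quantile: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration points for this level
    return sorted(scores)[k - 1]

def tail_specific_interval(cal_preds, cal_y, new_pred, alpha_lo, alpha_hi):
    """Intersect a lower and an upper one-sided split-conformal interval."""
    # Lower bound: calibrate how far the truth falls BELOW the prediction.
    q_lo = conformal_quantile([p - y for p, y in zip(cal_preds, cal_y)], alpha_lo)
    # Upper bound: calibrate how far the truth rises ABOVE the prediction.
    q_hi = conformal_quantile([y - p for p, y in zip(cal_preds, cal_y)], alpha_hi)
    return new_pred - q_lo, new_pred + q_hi

# Skewed toy data: the point predictor is the constant 0, the truth is exponential.
random.seed(0)
cal_y = [random.expovariate(1.0) for _ in range(999)]
cal_preds = [0.0] * len(cal_y)
lower, upper = tail_specific_interval(cal_preds, cal_y, 0.0, 0.05, 0.05)
print(lower, upper)
```

On skewed data like this the two margins differ a lot, which is the situation where separate tail calibration matters compared with a single symmetric absolute-residual score.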

2606.12872 2026-06-17 q-fin.PR New submission

Non-Spanning Identification of Scheduled Event Risk in Option Pricing

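Tenghan Zhong

AI summary: Proposes a non-spanning identification protocol that estimates the no-event volatility surface from non-spanning expiries and calibrates scheduled jumps on event-spanning training quotes; on S&P 500 index options, Gaussian and mixture jump models improve held-out event-spanning pricing.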

Abstract

Short-dated index options make scheduled macro-announcement risk visible in market prices, but visibility does not imply identification: a flexible no-event surface fitted to event-spanning quotes can absorb event premia, while a jump calibrated without event-spanning quotes is unidentified. To separate the continuous surface from the scheduled jump, we model Federal Open Market Committee (FOMC) decisions, Consumer Price Index (CPI) releases, and nonfarm payroll (NFP) reports as deterministic-time jumps in risk-neutral option pricing and propose a non-spanning identification protocol. Non-spanning expiries identify the no-event volatility surface, event-spanning training quotes calibrate the scheduled jump, and held-out event-spanning quotes are used only for pricing evaluation. On PM-settled S&P 500 index (SPX) options from May 2022 to August 2025, Gaussian and two-component mixture jumps improve held-out event-spanning pricing, with the clearest gains in robust median pricing errors and in event-volatility option combinations (straddles and strangles) rather than directional risk reversals. A contaminated-surface stress test confirms the identification concern: allowing event-spanning training quotes into the no-event surface fit produces strong held-out performance by absorbing event premia rather than identifying scheduled jump risk. An amortized mixture density network (MDN) benchmark shows limited cross-event transfer: pure leave-one-event-out amortization reduces implied-volatility errors but not mean dollar or mean spread-normalized pricing errors, while the scale-calibrated variant restores Gaussian-level performance yet remains below event-specific mixture calibration. Scheduled-jump identification is strongest for CPI and FOMC and weaker for NFP.

2606.03767 2026-06-17 econ.TH q-fin.GN Updated version

Trading Frictions in Dynamic Cap-and-Trade Markets

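Nicola Borri, Yukun Liu, Aleh Tsyvinski, Xi Wu

AI summary: Builds a dynamic stochastic market model with multiple trading frictions to study how those frictions affect the effectiveness of cap-and-trade markets, and quantifies it using 2.7 million EU Emissions Trading System (EU ETS) registry transactions and compliance records from 2005-2021.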

Abstract

We develop a dynamic stochastic model of markets with an externality and multiple trading frictions, and cap-and-trade as the leading application. Slow participation, limited intermediation, and heterogeneous information interact in equilibrium: agents choose costly market access, access determines residual compliance demand, intermediary constraints translate residual demand into a surrender-month premium, and the premium feeds back into access incentives. These interactions shape how effectively the market corrects the externality. We characterize access choices in closed form, prove that the equilibrium premium is unique, and show that endogenous access dampens the response to each friction in isolation, while the interaction of multiple frictions is non-additive and can amplify the price response. We quantify the model using 2.7 million EU ETS registry transactions and compliance records from 2005-2021. About 40% of operators do not trade annually, purchases concentrate in April when returns are systematically high, and operator flow predicts future returns.

2502.17518 2026-06-17 cs.LG cs.AI q-fin.CP stat.ML Updated version

Ensemble RL through Classifier Models: Enhancing Risk-Return Trade-offs in Trading Strategies

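Zheli Xiong

AI summary: A comprehensive study of ensemble reinforcement learning (RL) models in financial trading strategies that uses classifier models to enhance performance. It combines RL algorithms such as A2C, PPO, and SAC with traditional classifiers such as Support Vector Machines (SVM), Decision Trees, and Logistic Regression, and examines how different classifier groups can be integrated to improve risk-return trade-offs. Ensemble methods are compared with individual RL models on Cumulative Returns, Sharpe Ratio (SR), Calmar Ratio, and Maximum Drawdown (MDD). Ensemble methods often outperform the base models in risk-adjusted returns, with better drawdown management and overall stability, but their performance is sensitive to the choice of variance threshold τ, which highlights the importance of adjusting τ dynamically. The study has implications for financial trading, robotics, and other dynamic environments.

Comments 23 pages, 10 figures, 9 tables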

Abstract

This paper presents a comprehensive study on the use of ensemble Reinforcement Learning (RL) models in financial trading strategies, leveraging classifier models to enhance performance. By combining RL algorithms such as A2C, PPO, and SAC with traditional classifiers like Support Vector Machines (SVM), Decision Trees, and Logistic Regression, we investigate how different classifier groups can be integrated to improve risk-return trade-offs. The study evaluates the effectiveness of various ensemble methods, comparing them with individual RL models across key financial metrics, including Cumulative Returns, Sharpe Ratios (SR), Calmar Ratios, and Maximum Drawdown (MDD). Our original experimental results demonstrate that ensemble methods often outperform base models in terms of risk-adjusted returns, providing better management of drawdowns and overall stability. However, both the original analysis and the additional reproduction reported in this version show that ensemble performance is sensitive to the choice of variance threshold $\tau$, classifier group, RL-agent pair, and market universe. The reproduction evidence strengthens the conclusion that classifier-assisted ensemble selection can improve robustness, while also clarifying that the advantage is conditional rather than automatic across all datasets. This study emphasizes the value of combining RL with classifiers for adaptive decision-making, with implications for financial trading, robotics, and other dynamic environments.

2412.00607 2026-06-17 stat.ME q-fin.RM Updated version

On a risk model with tree-structured Poisson Markov random field frequency, with application to rainfall events

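Hélène Cossette, Benjamin Côté, Alexandre Dubeau, Etienne Marceau

AI summary: Proposes a tree-structured Poisson Markov random field model to capture frequency dependence across the risks in a portfolio, studies the asymptotic risk on infinitely growing trees, and demonstrates the model's flexibility and scalability on extreme rainfall data.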

Comments 40 pages

2501.00826 2026-06-17 q-fin.TR cs.AI Updated version

LLM-Powered Multi-Agent System for Automated Crypto Portfolio Management

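Yichen Luo, Yebo Feng, Jiahua Xu, Paolo Tasca, Yang Liu

AI summary: Proposes a three-agent system (market, news, trading) that fuses multi-modal signals through hierarchical, collaborative, and debate architectures; in a 2025 backtest it achieves a cumulative return of 133.52% and a Sharpe ratio of 1.502, outperforming single-agent and deep learning baselines.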

Abstract

Cryptocurrency portfolio management requires the fusion of heterogeneous multi-modal signals, including structured price and on-chain time series, unstructured news text, and technical indicators, under high-volatility and real-time constraints. While deep learning approaches show predictive capability, their opacity limits practical adoption, and single large language model (LLM) agents struggle to process the breadth of modality-specific inputs needed for robust decision-making. We propose a multi-agent system (MAS) framework in which three modality-specialised agents, a Crypto Agent for market dynamics, a News Agent for weekly news sentiment, and a Trading Agent for signal fusion and portfolio execution, decompose the task across three communication architectures: hierarchical, collaborative, and debate. We evaluate four capability configurations: zero-shot, chain-of-thought (CoT), retrieval-augmented generation (RAG), and skill-augmented. In a 52-week backtest over calendar year 2025 across the top 15 L1 blockchain native cryptocurrencies by market capitalisation as of January 2025, the best configuration, Hierarchical (Skill), achieves a cumulative return of 133.52% and a Sharpe ratio of 1.502, outperforming single-agent variants, passive benchmarks, and deep learning baselines. An ablation study identifies the Crypto Agent as the most critical component, with its removal reducing cumulative return by 42.57 percentage points. A cross-model comparison further shows that MAS outperforms the single-agent baseline under GPT-4o, GPT-5, and Claude Sonnet 4.5, suggesting that the benefit of multi-agent coordination is model-agnostic. Unlike black-box deep learning models, every portfolio decision is traceable to explicit agent reasoning, offering an interpretable and effective approach to multi-modal cryptocurrency portfolio management.

2111.14631 2026-06-17 q-fin.RM math.PR q-fin.CP q-fin.PM Updated version

Model Risk in Credit Portfolio Models

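Christian Meyer

AI summary: Addresses model risk in banks' credit portfolio models with a comprehensive, easy-to-implement approach to handling uncertainty in all model parameters.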

Comments 12 pages, 2 figures. This version: minor corrections, updates, and comments