arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.17939 2026-06-17 stat.AP stat.ML New submission

Understanding Long-Term Dynamics of Individual Metro Usage: A Hidden Semi-Markov State Framework with Survival Analysis

Bingxun Wang, Valeria Maria Urbano, Shan He, Yang Chen, Wei Liu, Zhibin Jiang, Piercesare Secchi

AI summary Proposes a framework combining hidden semi-Markov models with discrete-time survival analysis; using four years of Shanghai metro smart card data, it identifies five interpretable mobility states and their transition hierarchy, and shows that exit hazard is state-dependent but duration-independent, whereas re-entry hazard decays sharply with inactivity length.

Abstract

Understanding how individual metro usage evolves over multi-year horizons is essential for transit planning and passenger retention. However, existing approaches typically characterize mobility patterns as static clusters or short-term variability, leaving the lifecycle dynamics of transit participation underexplored. This study proposes a state-based lifecycle modeling framework that integrates Hidden Semi-Markov Models (HSMM) with discrete-time survival analysis to characterize the evolution of individual metro mobility. The HSMM infers latent mobility states with explicit duration distributions and a transition matrix governing regime changes, while the survival component models exit and re-entry events via state-dependent hazard functions conditioned on mobility-state trajectories and behavioral history. Applied to four years of smart card data from the Shanghai metro system (2021-2024), the framework enables the identification of interpretable mobility states, the characterization of transition dynamics, and the quantification of state-dependent exit and re-entry processes. The analysis reveals five robust mobility states with a directional transition hierarchy centered on an occasional-usage gateway state, and fundamentally different temporal mechanisms governing disengagement and return: exit hazard is state-dependent but duration-independent, whereas re-entry hazard decays sharply with inactivity length. These findings provide a methodological foundation for lifecycle-oriented mobility analysis and practical guidance for transit operators to identify at-risk users and time retention interventions.
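
The contrast between the two hazard shapes can be made concrete with a short sketch. This is only an illustration of discrete-time hazards, not the authors' model: the state names, the hazard values, and the exponential decay form in `reentry_hazards` are all assumptions chosen for the example.

```python
import math


def survival_curve(hazards):
    """Return S(1), S(2), ... where S(t) = prod_{s <= t} (1 - h(s))."""
    curve, still_at_risk = [], 1.0
    for h in hazards:
        still_at_risk *= 1.0 - h
        curve.append(still_at_risk)
    return curve


# Hypothetical per-period exit hazards by mobility state (not from the paper).
EXIT_HAZARD = {"frequent": 0.01, "occasional": 0.05}


def exit_hazards(state, n_periods):
    """State-dependent, duration-independent: the same hazard every period."""
    return [EXIT_HAZARD[state]] * n_periods


def reentry_hazards(n_periods, h0=0.30, decay=0.5):
    """Hazard that shrinks with inactivity length t (assumed exponential decay)."""
    return [h0 * math.exp(-decay * (t - 1)) for t in range(1, n_periods + 1)]


# Probability of still being active (exit) or still being inactive (re-entry).
for state in EXIT_HAZARD:
    print(state, [round(s, 3) for s in survival_curve(exit_hazards(state, 6))])
print("still inactive", [round(s, 3) for s in survival_curve(reentry_hazards(6))])
```

A constant hazard gives a geometric survival curve whose level depends only on the state, whereas a decaying re-entry hazard makes the "still inactive" curve flatten out, so most returns happen early in an inactivity spell.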

2606.17923 2026-06-17 stat.ME New submission

Spatial mixed models for assessing environmental exposure effects on the microbiome

Sooran Kim, Chan Wang, Soyoung Kwak, Fares Darawshy, Alexander Bain, Leopoldo N. Segal, Jiyoung Ahn, Huilin Li

AI summary Proposes a spatial mixed model framework that uses conditional autoregressive priors to handle region-level spatial dependence and taxon-level ecological dependence jointly, achieving high detection power and low false positive rates in feature selection; applied to PM2.5 exposure studies, it identifies relevant genera.

Abstract

The influence of environmental exposures, such as air pollution, on human health has become increasingly recognized. A growing body of evidence suggests that the microbiome may mediate these effects, explaining the relationship between the environment and host biology. However, the impact of environmental exposures on the microbiome is not yet fully understood, and statistical modeling in this context is challenged by complex dependency structures. In particular, microbiome data exhibit spatial dependencies across sampling regions as well as ecological correlations among microbial taxa, which, if ignored, can substantially reduce detection power, leading to missed true signals. We introduce a novel spatial mixed modeling framework for microbiome data that accounts for both region-level spatial dependency and taxon-level ecological dependency using conditional autoregressive priors. Through simulations, we demonstrate that this framework outperforms existing methods that ignore such dependencies, achieving high detection power in feature selection while maintaining low false positive rates and reduced mean squared error in estimation. Applied to two real studies with fine particulate matter (PM2.5) exposures (data from the Food and Microbiome Longitudinal Investigation study and a lung microbiome dataset), our model identified genera known to be involved in pollution-related health outcomes, as well as novel taxa that may mediate host responses to air pollution. This novel approach offers a powerful and flexible tool for uncovering biologically meaningful associations in complex environmental data.

2606.17841 2026-06-17 stat.ME New submission

Subgroup analysis in randomized controlled trials with binary outcomes: dilution and logic-respecting properties

Long-Hao Xu, Yang Han, Tim Friede

AI summary Studies the properties of the odds ratio and the relative response in subgroup analyses of randomized controlled trials with binary outcomes, shows that the odds ratio is inappropriate as an efficacy measure whereas the relative response is appropriate, and clarifies how the two differ in their logic-respecting and dilution properties.

Abstract

Subgroup analysis is routinely used in randomized controlled trials to examine whether treatment effects are homogeneous across patient subgroups or differ because of treatment-effect heterogeneity. In this paper, we investigate the properties of the odds ratio and the relative response in subgroup analyses with binary outcomes, extending previous work with new theoretical insights and methodological developments. We establish several new theorems that characterize how the odds ratio for the overall population changes in both magnitude and direction when two subgroups are combined. These results further confirm that the odds ratio is inappropriate as an efficacy measure in this subgroup setting, whereas the relative response is appropriate. We also present the formal relationship between the odds ratio and the relative response, and clarify their differences in terms of the logic-respecting property, that is, whether the overall efficacy lies between the subgroup efficacies, and the dilution property, that is, whether mixing subgroups moves the overall odds ratio toward 1. Although the odds ratio is generally not logic-respecting, it may behave approximately like a logic-respecting efficacy measure under certain conditions. To illustrate our findings, we present an example based on clinical trial data and discuss its implications for subgroup analysis in randomized controlled trials.
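
A small numerical sketch may help make the logic-respecting and dilution properties tangible. The response probabilities and the 50/50 subgroup split below are hypothetical, chosen only so the arithmetic is easy to follow, and "relative response" is taken here to mean the ratio of response probabilities.

```python
def odds_ratio(p_treat, p_ctrl):
    """Odds of response under treatment divided by odds under control."""
    return (p_treat / (1 - p_treat)) / (p_ctrl / (1 - p_ctrl))


def relative_response(p_treat, p_ctrl):
    """Ratio of response probabilities (treatment over control)."""
    return p_treat / p_ctrl


# Hypothetical response probabilities (p_treat, p_ctrl); not taken from the paper.
SUBGROUPS = {"A": (0.5, 0.2), "B": (0.8, 0.5)}
WEIGHT_A = 0.5  # assumed share of subgroup A, identical in both arms

# Overall response probabilities are prevalence-weighted mixtures of the subgroups.
p_treat_all = WEIGHT_A * SUBGROUPS["A"][0] + (1 - WEIGHT_A) * SUBGROUPS["B"][0]
p_ctrl_all = WEIGHT_A * SUBGROUPS["A"][1] + (1 - WEIGHT_A) * SUBGROUPS["B"][1]

or_by_group = {g: odds_ratio(*p) for g, p in SUBGROUPS.items()}
rr_by_group = {g: relative_response(*p) for g, p in SUBGROUPS.items()}
or_all = odds_ratio(p_treat_all, p_ctrl_all)
rr_all = relative_response(p_treat_all, p_ctrl_all)

print("odds ratio:", or_by_group, "overall", round(or_all, 3))
print("relative response:", rr_by_group, "overall", round(rr_all, 3))
```

With these numbers both subgroup odds ratios equal 4, yet the overall odds ratio is about 3.45, which falls outside the subgroup range and closer to 1. The overall relative response (about 1.86) lies between the subgroup values of 1.6 and 2.5.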

2606.17723 2026-06-17 stat.AP New submission

Tail Dependence in EU Carbon Markets: Graphical Models of Extremes for EUA Futures

Jan Maciejowski, Manuele Leonelli

AI summary Applies Hüsler-Reiss graphical models of extremes to 20 daily variables across Phases 3 and 4 of the EU ETS, finding that the tail networks are denser than the average-dependence network and organized around different central nodes, and that EUA futures have the highest centrality in the tail networks while equity indices and FX pairs show the opposite pattern.

Abstract

Understanding how extreme price movements propagate across financial and energy markets is critical for risk management and regulatory design in the EU Emissions Trading System (EU ETS). We apply Hüsler-Reiss graphical models of extremes to a system of 20 daily variables centred on EU allowances futures across Phases 3 and 4 of the EU ETS (2013-2025), with a Gaussian graphical model as the average-dependence baseline. The tail networks are structurally distinct from the average dependence network: substantially denser, organized around different central nodes, and governed by within-sector homophily that binds sector boundaries more tightly than at the average-dependence level. EU allowances futures are peripheral in the standard graphical model but achieve the highest centrality in the tail networks, while equity indices and major FX pairs follow the opposite trajectory. Exponential random graph models confirm equity and FX peripherality in tail networks across all sample periods and identify triadic closure during market downturns as a Phase 3 phenomenon that vanishes in Phase 4. The phase transition restructures the tail network without thinning it: average dependence contracts sharply while tail dependence persists, and crash contagion shifts from clustered to diffuse propagation. These findings have direct implications for hedge construction by compliance entities, stress-test calibration by regulators, and the design of systemic-risk monitoring tools for EU ETS markets.

2606.17717 2026-06-17 stat.ME stat.AP New submission

Double zero-inflated spatio-temporal modeling of daily precipitation under detection thresholds

Juan Marcen-Gutierrez, Jorge Castillo-Mateo, Alan E. Gelfand, Jesús Asín, Ana C. Cebrián

AI summary Addresses the two types of zeros in daily precipitation (dry events with no precipitation and unmeasured precipitation below the detection limit) with a multilevel spatio-temporal model that combines a probit regression, a Gamma regression, and a threshold-censored observation mechanism, uses Gaussian processes to capture spatial dependence, and provides inference with exact uncertainty within a Bayesian framework.

Comments 38 pages (+33 pages supplement), 7 figures (+35 figures supplement), 5 tables

Abstract

Explaining precipitation behavior at the daily scale is important for fine-scale understanding of the mechanisms driving precipitation. However, this effort is challenging because of the frequent incidence of zeros. The challenge is amplified by the acknowledged incidence of two types of zeros: absence of precipitation as a dry event and absence of measured precipitation due to detection limits. In this work, we propose a multilevel spatio-temporal model which allows us to distinguish and explain the two types of zeros, as well as to model positive precipitation above the detection limit. The methodology combines a point mass at zero with probability modeled through a probit regression, a Gamma regression for latent positive precipitation amounts, and an observation mechanism subject to threshold-induced censoring. To capture spatial dependencies, Gaussian processes are employed in each regression model. Working within a Bayesian framework, we can obtain a rich range of inference with exact uncertainty. In particular, we provide model-based inference tools to compare and quantify differences between the true precipitation process and its observed counterpart across relevant characteristics. We apply our model to the analysis of daily spring observations at 70 sites over 15 years from the Ebro River Basin in northeastern Spain. Our findings indicate that the threshold strongly affects the occurrence of observed precipitation, especially in humid regions. While its impact on total accumulated amounts is small, it can exert a relevant effect on upper quantiles.
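
The two kinds of zeros can be illustrated with a minimal simulation of the data-generating mechanism for a single site. This is a sketch under simplifying assumptions, not the authors' model: it has no covariates, no Gaussian processes, and no Bayesian fitting; the probit is assumed to give the probability of a wet day; and the parameter values (`eta`, `shape`, `scale`, `threshold`) are invented.

```python
import random
from statistics import NormalDist


def simulate_day(rng, eta, shape, scale, threshold):
    """Return (is_wet, latent_amount, observed_amount) for one site-day."""
    p_wet = NormalDist().cdf(eta)  # probit link: P(wet) = Phi(eta)
    is_wet = rng.random() < p_wet
    # Latent positive amount follows a Gamma distribution on wet days.
    latent = rng.gammavariate(shape, scale) if is_wet else 0.0
    # Amounts below the detection threshold are recorded as zero.
    observed = latent if latent >= threshold else 0.0
    return is_wet, latent, observed


rng = random.Random(0)
days = [
    simulate_day(rng, eta=-0.3, shape=0.7, scale=5.0, threshold=0.2)
    for _ in range(10_000)
]

dry_zeros = sum(1 for wet, _, _ in days if not wet)
censored_zeros = sum(1 for wet, _, obs in days if wet and obs == 0.0)
true_wet = sum(1 for wet, _, _ in days if wet)
observed_wet = sum(1 for _, _, obs in days if obs > 0.0)

print("dry zeros:", dry_zeros, "censored zeros:", censored_zeros)
print("true wet days:", true_wet, "observed wet days:", observed_wet)
```

Both kinds of zero look identical in the observed record, so the observed wet-day frequency can only understate the true one; this gap between the latent process and its observed counterpart is what the model in the abstract sets out to quantify.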

2606.17515 2026-06-17 stat.ME stat.ML New submission

Anytime-valid Optimal Policy Identification

Daniel Molitor

AI summary For logged contextual bandit data, proposes an anytime-valid framework that constructs a time-indexed set containing the true optimal policy set with high probability, supports continuous monitoring and adaptive stopping, and derives a sample-complexity bound.

Comments 15 pages, 3 figures

Abstract

We develop an anytime-valid framework for optimal policy identification from logged contextual bandit data. In many applied settings, the analyst wants to select the optimal policy from a candidate policy class $\Pi$, but data are generated by an externally determined logging policy that they do not control. The analyst may also wish to monitor evidence continuously, stopping once the optimal policy is clear rather than committing to a fixed sample size in advance. This paper addresses these challenges by constructing a time-indexed set $S_t$ that retains the true optimal policy set uniformly over time with high probability. The resulting procedure allows the analyst to monitor policy values, eliminate clearly suboptimal policies, and stop at data-dependent times without invalidating inference. When the optimal policy is unique, we define a stopping time for its identification and derive a sample-complexity bound scaling as $O\!\left(\frac{\log |\Pi|+\log\log(1/\Delta_{\min})}{\Delta_{\min}^2}\right)$, where $\Delta_{\min}$ is the gap between the best and second-best policy values. Simulations demonstrate that the anytime-valid approach can yield substantial sample savings relative to fixed-$N$ designs. An application to a large adaptive experiment on reducing misinformation online illustrates how the method provides a dynamic view as evidence on the optimal policy accumulates.

2606.17486 2026-06-17 stat.ME stat.CO New submission

Improving Linear Regression on Small Datasets via Gaussian Process and Extreme Value Theory-Based Data Augmentation

Ibrahim Salay, Jagath Senarathne

AI summary To address violations of classical assumptions in small-sample regression, proposes GP-MEVT, a hybrid data augmentation method that combines Gaussian processes with extreme value theory to extend the predictor space while preserving linear structure, and that outperforms standard bootstrap methods on simulated and real data.

Abstract

Small sample sizes pose significant challenges in regression analysis, often leading to violations of classical assumptions such as normality, homoscedasticity, and independence of residuals. These violations compromise parameter estimation accuracy, reduce statistical power, and limit the generalizability of findings. This study introduces the Gaussian Process-based Modified Extreme Value Theorem (GP-MEVT) method, a novel hybrid data augmentation approach that combines Gaussian Process with Extreme Value Theory to address these limitations. The GP-MEVT method generates augmented observations that extend the predictor space beyond the observed range while preserving the underlying linear structure and introducing controlled variability based on residual variation. Through comprehensive simulation studies across three variance scenarios (sigma = 2, 5, 8) and sample sizes (n = 10, 15, 20), we demonstrate that GP-MEVT achieves a higher rate of assumption satisfaction, substantially outperforming standard bootstrap and bootstrap with noise methods. The proposed method also exhibits reasonable parameter estimation accuracy, with intercept and slope estimates consistently closer to true parameter values, and maintains competitive or superior model fitting performance as measured by root mean square error. Application to a real-world dataset confirms these advantages, with GP-MEVT achieving a 67.1% assumption satisfaction rate compared to 17.3% and 21.2% for bootstrap alternatives. These findings establish GP-MEVT as a robust and reliable framework for fitting linear regression models to small datasets, offering practitioners a principled approach to statistical inference when sample size limitations are unavoidable.

2606.17424 2026-06-17 stat.ME New submission

The dangers of using three-number summaries to estimate unknown standard deviations: sensitivity analyses and some possible improvements incorporating shape

Udara Kumaranathunga, Alysha De Livera, Luke A. Prendergast

AI summary Shows that three-number summaries (minimum, median, maximum) are insufficient to estimate standard deviations reliably, proposes a new estimator based on the scaled Beta distribution, and develops sensitivity analysis tools to improve the reliability of inference.

Abstract

In recent years, there has been much progress toward the development of methods for converting three- and five-number summary statistics (i.e. minimum, maximum, median, and quartiles) to means and standard deviations (SDs). This is commonly done in the meta-analysis setting, where some studies report means and SDs, while others report quantile summaries. However, we show that three-number summaries, which are the most common, do not contain enough information to reliably estimate SDs. We show that very poor estimates can result, which may invalidate any inference, and we provide details of a sensitivity analysis that can allow researchers to have greater confidence in their results, or highlight potential sources of bias. We further explore whether nominating additional information can provide enough information regarding the unknown data shape to improve SD estimations, and in doing so introduce a new estimator using the scaled Beta distribution. Simulations and a real data example are used to highlight the advantages and disadvantages of this approach. A Web application is also provided to help researchers perform sensitivity analyses.
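
The information gap is easy to see with two made-up samples that share the same minimum, median, maximum, and sample size but have very different spread. The conversion used below is a common normal-theory range rule, assumed here purely for illustration; the abstract does not say which estimators the authors examine, and this is not their scaled Beta estimator.

```python
from statistics import NormalDist, median, stdev


def sd_from_range(minimum, maximum, n):
    """Normal-theory range rule: SD ~ (max - min) / (2 * z),
    with z = Phi^{-1}((n - 0.375) / (n + 0.25))."""
    z = NormalDist().inv_cdf((n - 0.375) / (n + 0.25))
    return (maximum - minimum) / (2 * z)


def three_number_summary(data):
    """Return (min, median, max) of a sample."""
    return min(data), median(data), max(data)


# Two hypothetical samples: identical (min, median, max) and n, different spread.
concentrated = [0, 5, 5, 5, 5, 5, 5, 5, 5, 5, 10]
polarized = [0, 0, 0, 0, 0, 5, 10, 10, 10, 10, 10]

for name, data in (("concentrated", concentrated), ("polarized", polarized)):
    lo, med, hi = three_number_summary(data)
    print(
        name, (lo, med, hi),
        "true SD", round(stdev(data), 2),
        "range-rule SD", round(sd_from_range(lo, hi, len(data)), 2),
    )
```

Any estimator that sees only the three-number summary and n must return the same value for both samples (about 3.1 with this rule), although the actual sample SDs are roughly 2.24 and 5. That is the kind of shape sensitivity the abstract warns about.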

2606.17232 2026-06-17 stat.ME New submission

Semiparametric Mediation Analysis with Separately Observed Mediator and Outcome under Unmeasured Confounding

Sijia Li, Ruoyu Wang

AI summary For incomplete data in which the mediator and outcome are never jointly observed, proposes a data fusion framework that uses shared instrumental variables to identify natural direct and indirect effects under a no-interaction condition, and develops semiparametric influence-function-based estimators with multiple robustness properties.

Comments 24 pages; 2 figures

Abstract

Mediation analysis is widely used to disentangle causal pathways, yet in many real-world studies the mediator M and outcome Y are never jointly observed. This incompleteness breaks the standard identification strategy for natural direct and indirect effects. We introduce a novel data fusion framework that restores identification by combining two incomplete data sources, one measuring M and the other measuring Y. Our approach leverages shared instrumental variables (IVs) to circumvent the need to observe (M,Y) jointly, remains valid under unmeasured confounding via a no-interaction condition, and accommodates covariate and exposure shifts across data sources under a latent alignment condition. We establish two identification strategies, one for settings with a known set of valid IVs, and another for settings where valid IVs must be learned. We further develop semiparametric, influence-function-based estimators with multiple robustness properties, and propose an estimator that attains the semiparametric efficiency bound under appropriate conditions. We apply our framework to quantify the extent to which the effect of SNP rs610932 on dementia risk is mediated through immune-related gene-expression pathways.

2606.17181 2026-06-17 stat.ME stat.AP New submission

Tropical Viterbi Tubes for Decoding Uncertainty in Hidden Markov Models

Aurélien Nicosia

AI summary Proposes the tropical Viterbi tube, which uses a tolerance threshold to capture the uncertainty among near-optimal paths in hidden Markov models, and provides exact projection algorithms and a calibration method.

Comments 33 pages, 4 figures; supplementary material included as ancillary file; submitted to The Annals of Applied Statistics

Abstract

Hidden Markov models are widely used to infer latent state sequences from sequential data, but Viterbi decoding reports only one most likely complete path. When decoded states carry scientific meaning, this single maximizer can conceal pathwise uncertainty created by multiple near-optimal trajectories. Conditional on a fitted HMM, we introduce the tropical Viterbi tube: the set of hidden trajectories whose complete-data log-score lies within a tolerance of the Viterbi optimum. State, transition, and change-status projections show which local features remain compatible with globally near-optimal complete paths, giving a pathwise uncertainty layer for HMMs in sequence analysis, ecology, finance, biomedical monitoring, and related domains. The tube is a posterior superlevel set on complete hidden-path space, with tolerance interpreted as a log posterior-odds loss relative to a Viterbi path. Calibrating the tolerance to a target posterior mass gives an HPD-threshold credible region for the complete latent path and conservative simultaneous projected bands. We prove monotonicity, step-function behavior, and deterministic stability guarantees, and compute projected tubes exactly by max-plus forward-backward recursions in O(TK^2) time for dense transitions. Posterior tube mass and HPD calibration are separate pathwise calculations approximated by FFBS. In a public bat-tracking application, robust foraging tube segments are enriched for feeding buzzes, whereas robust commuting segments are depleted: at eta = 0.005, enrichment is 2.25 with 95% bootstrap interval (1.73, 2.85) for robust foraging and 0.27 with interval (0.16, 0.44) for robust commuting.
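
One way to picture the state projection of such a tube is through max-plus forward and backward recursions. The sketch below is a reading of the abstract rather than the author's code: it takes the complete-data log-score to be the joint log-probability of path and observations, computes for every time t and state k the best complete-path score passing through (t, k), and keeps the states whose score is within a tolerance `eta` of the Viterbi optimum. The two-state HMM and all function names are hypothetical.

```python
import math


def max_plus_state_scores(log_init, log_trans, log_emit, obs):
    """scores[t][k] = best complete-path log-score among paths in state k at time t."""
    T, K = len(obs), len(log_init)
    alpha = [[0.0] * K for _ in range(T)]  # best prefix score ending in k at t
    beta = [[0.0] * K for _ in range(T)]   # best suffix score after k at t
    for k in range(K):
        alpha[0][k] = log_init[k] + log_emit[k][obs[0]]
    for t in range(1, T):
        for k in range(K):
            best_prev = max(alpha[t - 1][j] + log_trans[j][k] for j in range(K))
            alpha[t][k] = best_prev + log_emit[k][obs[t]]
    for t in range(T - 2, -1, -1):
        for k in range(K):
            beta[t][k] = max(
                log_trans[k][j] + log_emit[j][obs[t + 1]] + beta[t + 1][j]
                for j in range(K)
            )
    return [[alpha[t][k] + beta[t][k] for k in range(K)] for t in range(T)]


def state_tube(scores, eta, eps=1e-12):
    """States at each time compatible with a path within eta of the Viterbi optimum."""
    viterbi_score = max(scores[0])  # every path passes through some state at t = 0
    return [
        {k for k, s in enumerate(row) if s >= viterbi_score - eta - eps}
        for row in scores
    ]


# Hypothetical two-state HMM with three observation symbols; values are illustrative.
log_init = [math.log(p) for p in (0.6, 0.4)]
log_trans = [[math.log(p) for p in row] for row in ((0.7, 0.3), (0.4, 0.6))]
log_emit = [[math.log(p) for p in row] for row in ((0.5, 0.4, 0.1), (0.1, 0.3, 0.6))]
obs = [0, 1, 2, 1]

scores = max_plus_state_scores(log_init, log_trans, log_emit, obs)
for eta in (0.0, 0.5, 2.0):
    print(eta, state_tube(scores, eta))
```

Each recursion does K work for each of the T·K cells, which matches the O(TK^2) cost quoted for dense transitions. Widening `eta` can only add states to each time slice, in line with the monotonicity the abstract mentions.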

2606.18087 2026-06-17 econ.GN q-fin.EC New submission

Environmental Threat and the Nation: Earthquake Risk, Distributive Priority, and Expressive Attachment

Hector Galindo-Silva

AI summary Using data from 494 regions in 63 countries, finds that long-run earthquake risk strengthens national identity, mainly through the expressive channel (pride, willingness to fight) rather than the distributive channel, and that the effect is more pronounced where religious symbolic infrastructure is in place.

Abstract

This paper studies how long-run earthquake risk shapes national identity, separating a distributive margin (national membership as a rule for allocating scarce resources) from an expressive margin (pride, willingness to fight, and affective attachment). Linking World Values Survey respondents (1981-2022; 63 countries, 494 subnational regions) to subnational seismic-risk geography, I find that people living closer to high-risk zones express stronger national in-group orientation: more pride, more willingness to fight, and more priority for nationals when jobs are scarce. Family attachment and out-group hostility do not rise, while religiosity increases in parallel. The expressive margin is conditional: the pride response is pronounced where state-religion alignment and a cohesive religious field lend the symbolic infrastructure to cast disaster as a shared national ordeal, and indistinguishable from zero where they do not. A complementary design exploiting earthquakes between adjacent survey waves finds no average short-run response, yet the response it does detect concentrates among older, place-attached residents who cannot leave, consistent with attitudes tracking a chronic, inescapable risk rather than single events. Together, the results point to a demand-side origin of national attachment: where a covariate shock would overwhelm local and family insurance, people turn to larger communities of protection and meaning (the nation and religion), a logic I formalize in a simple social-interaction model.

2606.17807 2026-06-17 econ.GN q-fin.EC New submission

Household coping mechanisms under grid failure: Evidence from a high electrification context in Lebanon

Majd Olleik, Haytham M. Dbouk, Anne Neumann, Elsa Bou Gebrael, Sebastian Zwickl-Bernhard

AI summary Using survey data from 1,000 households in Lebanon, studies supply-side coping mechanisms under grid failure, such as diesel generators and PV-battery systems, together with demand-side adaptations, and shows that socioeconomic status plays a key role in access to coping solutions and in the extent of met demand.

Comments Submitted to a peer-reviewed journal

Abstract

Despite near-universal electrification in many countries, electricity supply shortages continue to shape household energy use. This paper examines how households adapt to chronic grid failure in high-electrification, high-dependence contexts, using Lebanon as a case study. Drawing on original survey data from 1,000 households, we analyze both supply-side coping mechanisms such as diesel generators and solar photovoltaic (PV)-battery systems, and demand-side adaptations, including load shifting and demand suppression. The results reveal a landscape of household responses, where socioeconomic status plays a central role in determining access to backup solutions and the extent of met demand. While diesel generators remain widespread, a transition toward PV-battery systems is observed, especially among financially capable households. However, decentralized self-generation is associated with inefficiencies, including substantial levels of curtailed solar generation. On the demand side, households exhibit reductions in electricity use, leading to distinct consumption profiles depending on the type of backup system employed. These findings highlight the importance of distinguishing between met and unmet demand when assessing energy needs under unreliable supply. The paper contributes to the literature by providing a quantitative characterization of the interaction between self-generation and demand adaptation in a supply-constrained high-electrification context. It also offers empirical demand profiles that incorporate suppressed consumption, addressing a key gap in electricity system planning. From a policy perspective, the results underscore the need to account for unmet demand, address inequities in access to coping technologies, and reduce inefficiencies in decentralized systems.

2606.17503 2026-06-17 econ.GN q-fin.EC New submission

What Prediction Markets Can See: Market Formation, Settlement Legibility, and the Geography of Tradable Uncertainty in Africa and Latin America

Ade Adegbenro

AI summary Analyzing 6,047 Africa-topic and Latin America-topic contracts on Polymarket and Kalshi, constructs a measure of settlement legibility and finds that market formation is selective: sports and election contracts dominate while contracts on important civic events are scarce, and legibility predicts listing in the expected direction but falls short of pre-specified criteria.

Comments 45 pages

Abstract

Prediction markets are usually evaluated after their contracts exist, by asking how well prices forecast outcomes. We study the prior institutional margin of market formation, asking which uncertainties become tradable contracts at all. Using an audited dataset of 6,047 Africa-topic and Latin America-topic contracts listed on Polymarket and Kalshi, we construct a coded measure of settlement legibility, the degree to which an uncertainty can be worded, sourced, and credibly resolved by third parties, and validate it on 451 units under a frozen codebook, where independent double scoring reaches ordinal reliabilities of 0.92 and 0.96 on the primary dimensions and blind human benchmarks reach 0.97 and 0.92. Using this measure, we find that formation is selective in ways that public importance does not explain, with African inventory concentrated overwhelmingly in football while salient civic events produce little or no inventory, and Latin American inventory deeper but dominated by Venezuela, where attention to prospective United States military action sustains the largest civic cluster in the data. Legibility orders the inventory steeply, with sports and elections near the top of the scale and conflict at the bottom. In a formation test against an externally assembled frame of 131 civic events, legibility predicts listing in the expected direction but falls short of pre-specified acceptance criteria, while among listed contracts the relation between legibility and trading value is negative, as a model of selective listing implies and as we predicted before estimation. Prediction-market inventories therefore measure what platforms can settle as much as what traders believe, and reading them as maps of public interest conflates the two.

2606.17423 2026-06-17 q-fin.CP stat.ML New submission

Martingale Doppelgänger-Eval: An Identification Framework for Auditing Candlestick Understanding in Vision-Language Models

Ziyao Wang

AI summary Proposes the Martingale Doppelgänger-Eval benchmark, which uses controlled experiments to identify whether VLMs base their judgments on candlestick evidence rather than trend extrapolation, and finds that models ignore candlestick semantics or respond in the opposite direction.

Abstract

We introduce Martingale Doppelgänger-Eval, a public shadow-market benchmark for auditing whether vision-language models (VLMs) use candlestick evidence rather than extrapolate past trends. The central difficulty is identification: on real market histories, chart evidence and trend are strongly coupled, so an observational score cannot determine whether a fluent technical-analysis narrative is grounded in local visual evidence. We prove this limitation formally: no evaluation functional computed from observational chart-label data can distinguish a grounded responder from a trend-shortcut responder under strong coupling, whereas matched evidence interventions separate the same responders at an exponential rate and trend-label swaps provide an independent shortcut stress test. The benchmark therefore evaluates frozen VLMs on rendered OHLCV charts under four controlled mechanisms: a martingale-null market, injected-alpha counterfactual pairs, trend-confounder swaps, and regime shifts. A structural behavioral model identifies null-market bias, trend sensitivity, evidence sensitivity, prompt/renderer fragility, and evidence faithfulness; the accompanying statistical toolkit provides minimum detectable effects, block-aware sequential testing for metered APIs, and an overlap-weighted artifact check. Across frozen commercial and open VLMs, the identified regression assigns large positive coefficients to past trend but evidence coefficients that are zero or opposite to the rule-implied sign. Matched-pair analyses show that models either ignore injected candlestick semantics or move opposite to the rule-implied direction conditional on responding. The benchmark isolates a failure mode that standard observational chart benchmarks cannot detect and gives a reusable audit template for time-series imagery with controllable label mechanisms.

2606.17373 2026-06-17 econ.GN q-fin.EC 新提交

Some General Remarks on Private Property

关于私有财产的一些一般性评论

Adnan N. Alabbar, Walter E. Block

AI总结 本文从社会、法律和经济角度分析私有财产,遵循洛克传统,聚焦于首次使用无主物的获取行为,并探讨财产定义中的开放纹理问题及洛克体系中成为财产的必要条件。

Comments 46 pages

详情
AI中文摘要

私有财产是文明社会的核心制度之一。我们首先考虑其社会、法律和经济方面。然后遵循洛克传统,关注一个特定的程序性定义:先占(Homesteading)是指首次使用一个最初无主的对象的获取行为。具体对象的本体论及其使用方式决定了对象如何被获取。在本文中,我们处理财产定义中的开放纹理问题,然后提供在洛克体系中一个对象成为财产的必要条件。

英文摘要

Private Property is one of the central institutions of civilized society. We first consider its social, legal, and economic aspects. We then follow the Lockean tradition by focusing on a specific procedural definition: Homesteading is the acquisitive act of first using an object that is initially unowned. The ontology of concrete objects and the nature of their uses determine how objects may be acquired. In this article, we address the open-texture problem in the definition of property, then provide the necessary conditions for an object to be property in the Lockean Scheme.

2606.17290 2026-06-17 econ.GN q-fin.EC 新提交

Competing firms, competing regulators: The strategic cost of fragmented climate policy

竞争企业,竞争监管者:碎片化气候政策的战略成本

Nicole Adler, Gianmarco Andreana, Gerben de Jong

AI总结 本文通过两阶段博弈框架分析碎片化气候政策下企业反应与治理结构的互动,发现全球统一监管在对称市场中最优,但非对称市场中分散制度更优,且区域特定收费能实现最高福利但存在分配不均。

详情
AI中文摘要

全球网络产业的气候政策在碎片化的司法管辖区实施,但企业通过整合运营网络做出响应。我们开发了一个两阶段博弈理论框架,分析企业层面的反应如何与替代治理结构相互作用。监管者首先选择排放收费。企业随后通过定价、服务能力和资本部署决策进行竞争。分析结果表明,在对称市场中,统一的全球监管最大化福利。然而,在足够不对称的市场中,统一的全球收费不如分散制度。多种监管工具能更好地适应区域特定的市场外部性。我们将该框架应用于北美、西欧和跨大西洋航空市场的校准案例研究。数值结果表明,设定区域特定收费的全球协调监管者实现了最高的总福利。然而,这些总收益掩盖了跨司法管辖区的显著分配差异。因此,网络产业中有效的气候治理不仅需要确定高效的排放收费。政策工具应适应区域异质性,并且需要转移机制来确保高效、政治稳定的合作。

英文摘要

Climate policy in global network industries is implemented across fragmented jurisdictions, yet firms respond through integrated operational networks. We develop a two-stage game-theoretic framework to analyze how firm-level responses interact with alternative governance structures. Regulators first choose emissions charges. Firms subsequently compete through pricing, service capacity and capital deployment decisions. The analytical results demonstrate that uniform global regulation maximizes welfare in symmetric markets. However, in sufficiently asymmetric markets, a uniform global charge is dominated by decentralized regimes. Multiple regulatory instruments better accommodate region-specific market externalities. We apply this framework to a calibrated case study of North American, Western European and transatlantic aviation markets. The numerical results establish that a globally coordinated regulator setting region-specific charges achieves the highest aggregate welfare. These aggregate gains nonetheless mask substantial distributional disparities across jurisdictions. Effective climate governance in network industries therefore requires more than determining an efficient emissions charge. Policy instruments ought to accommodate regional heterogeneity and transfer mechanisms will be necessary to ensure efficient, politically stable cooperation.

2606.17079 2026-06-17 econ.GN econ.EM q-fin.EC 新提交

Partial Identification of Spatial Production Networks

空间生产网络的部分识别

Shaowen Luo, Kwok Ping Tsang, Zichao Yang

AI总结 针对公共数据无法观测跨州买卖关系的问题,利用运输线性规划计算线性暴露统计量的尖锐识别集,应用于美国州-部门数据发现货物运输数据与关键商品部门的空间扩散性不一致,但无法唯一识别区域生产网络或州对本地冲击的暴露排名。

详情
AI中文摘要

当公共数据无法观测跨州的买卖关系时,哪些区域暴露结论是可识别的?我们通过将缺失的中间投入空间核视为一个受区域活动边际、支撑限制和辅助运输矩约束的未知耦合来研究这一问题。对于线性暴露统计量,尖锐识别集通过运输线性规划计算。将该方法应用于美国州-部门数据,我们发现货物运输数据与关键商品部门中比例区域化所隐含的空间扩散性不一致。然而,它们并不能唯一识别区域生产网络或州对本地冲击暴露的精确排名。双边运输限制收紧了边界,但剩余的不确定性主要来自服务和大混合部门,这些部门在货物运输数据中覆盖较弱。结果表明,哪些暴露结论得到公共数据的支持,哪些是由维持的区域化假设所强加的。

英文摘要

Which regional exposure conclusions are identified when public data do not observe buyer-seller links across states? We study this question by treating the missing intermediate-input spatial kernel as an unknown coupling constrained by regional activity margins, support restrictions, and auxiliary shipment moments. For linear exposure statistics, the sharp identified set is computed by transportation linear programs. Applying the method to U.S. state-sector data, we find that shipment data are inconsistent with the spatial diffuseness implied by proportional regionalization in key goods sectors. However, they do not identify a unique regional production network or a precise ranking of state exposure to local shocks. Bilateral shipment restrictions tighten the bounds, but much of the remaining uncertainty comes from large service and mixed sectors that are weakly covered by goods-movement data. The results show which exposure conclusions are supported by public data and which are imposed by maintained regionalization assumptions.
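
The idea of bounding a linear exposure statistic over all couplings consistent with known margins can be illustrated in the smallest possible case. The sketch below is not the authors' code: it assumes a toy two-region economy, ignores support restrictions and shipment moments, and uses the fact that a 2x2 coupling with fixed margins has one free cell, so the transportation linear program reduces to checking two endpoints. The function name and the example numbers are hypothetical.

```python
def coupling_bounds_2x2(row_margins, col_margins, weights):
    """Sharp bounds on sum(weights[i][j] * x[i][j]) over all nonnegative
    2x2 matrices x with the given row sums (sellers) and column sums (buyers).

    With fixed margins only x[0][0] is free, and the statistic is linear in it,
    so the minimum and maximum are attained at the ends of its feasible range.
    """
    r1, r2 = row_margins
    c1, c2 = col_margins
    if r1 + r2 != c1 + c2:
        raise ValueError("row and column margins must have the same total")

    # Feasible range of x11 implied by nonnegativity of all four cells.
    lowest = max(0.0, r1 - c2)
    highest = min(r1, c1)

    def statistic(x11):
        x12 = r1 - x11
        x21 = c1 - x11
        x22 = r2 - x21
        return (weights[0][0] * x11 + weights[0][1] * x12
                + weights[1][0] * x21 + weights[1][1] * x22)

    values = [statistic(lowest), statistic(highest)]
    return min(values), max(values)


# Hypothetical example: own-region flows (diagonal cells) as the statistic.
bounds = coupling_bounds_2x2((60, 40), (50, 50), [[1, 0], [0, 1]])
```

In this toy case the proportional (independent) coupling gives an own-region flow of 50, which lies strictly inside the bounds; that gap is the kind of slack a maintained regionalization assumption silently removes. Larger problems would need a general LP solver rather than this endpoint check.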

2606.18058 2026-06-17 eess.IV q-bio.QM 新提交

Multiscale reconstruction of protein conformations from cryo-EM images

从冷冻电镜图像中多尺度重建蛋白质构象

David Y. W. Thong, Ozan Öktem, Joakim Andén

AI总结 提出一种多尺度算法,直接从单颗粒冷冻电镜数据恢复蛋白质原子模型,通过显式表示蛋白质主链的键、扭转角和键角,在噪声高、对比度低的数据上达到最先进精度,并提高RMSD和TM分数。

Comments 19 pages, 11 figures. Submitted to the Journal of Structural Biology

详情
AI中文摘要

我们提出了一种新颖的多尺度算法,用于从单颗粒冷冻电镜数据中直接恢复蛋白质的原子模型结构。我们的算法能够针对高噪声和低对比度的数据估计出达到最先进精度的蛋白质结构。它还对TEM图像形成模型中的错误指定具有鲁棒性。这些理想的特性主要归功于使用键、扭转角和键角对蛋白质主链进行显式表示,这为结构恢复过程提供了丰富的先验信息。我们将该方法应用于三个蛋白质冷冻电镜数据集(使用电子显微镜数字孪生产生),并表明使用多尺度方法相对于真实值在均方根偏差(RMSD)和模板建模(TM)分数上有所改进。此外,有证据表明多尺度算法优先考虑更大尺度的结构,这减少了收敛到不良局部极小值的可能性。

英文摘要

We present a novel multiscale algorithm for directly recovering the atomic model structure of a protein from single-particle cryo-EM data. Our algorithm is able to estimate protein structures to state-of-the-art accuracy for high-noise and low-contrast data. It is also robust to misspecifications in the TEM image formation model. These desirable properties are primarily due to the use of an explicit representation of the protein backbone in terms of bonds, torsion angles and bond angles, which supplies rich prior information to the structure recovery process. We apply our method on three protein cryo-EM datasets, generated using an electron microscope digital twin, and show that using a multiscale approach yields an improvement of the root-mean-square deviation (RMSD) and template modelling (TM) scores with respect to the ground truth. Furthermore, there is evidence that larger-scale structures are being prioritised with the multiscale algorithm, which reduces the possibility of convergence to bad local minima.

2606.18179 2026-06-17 q-bio.GN 新提交

PyPeakRankR: Reproducible Peak-Level Feature Extraction for Regulatory Element Ranking

PyPeakRankR:用于调控元件排序的可重现峰级特征提取

Saroja Somasundaram, Nelson J. Johansen, Trygve E. Bakken, Jeremy A. Miller

AI总结 提出PyPeakRankR开源Python包,从ATAC-seq峰中提取BigWig信号、GC含量、PhyloP保守性、分布矩和细胞类型特异性排名等特征,形成可重现的峰-特征矩阵,支持透明基准测试和跨组装评分;其前身PeakRankR在BICCN挑战的16种方法中排名前三。

Comments Software paper. Code: https://github.com/AllenInstitute/PeakRankR/tree/python-package. 6 pages, 1 figure

详情
AI中文摘要

高通量染色质可及性检测(如ATAC-seq)可生成数千个候选调控元件(峰),但目前尚无标准化工具来整合多种定量特征以优先选择峰进行功能验证。本文提出PyPeakRankR,一个开源Python包,它提取峰级特征,即BigWig信号汇总、GC含量、PhyloP保守性评分、分布矩(峰度、偏度、双峰性)和细胞类型特异性排名,并将其整合为一个可重现的峰×特征矩阵,以制表符分隔值(TSV)文件存储。PyPeakRankR将确定性特征提取与下游排序分离,使得在相同上游数据上对优先排序策略进行透明基准测试成为可能。该包提供命令行界面和匹配的Python API,支持通过liftOver进行跨组装评分,并在数分钟内处理数千个峰。PyPeakRankR在脑倡议细胞普查网络(BICCN)社区挑战中得到验证,其前身PeakRankR在16种方法中排名前三,用于细胞类型特异性增强子预测。在最近的一项基底神经节研究中,PyPeakRankR被用于跨物种增强子排序管道(CERP),以识别在多种细胞类型中实现超过70%靶向特异性的增强子-AAV工具。PyPeakRankR在MIT许可下免费提供,网址为https://github.com/AllenInstitute/PeakRankR/tree/python-package。

英文摘要

High-throughput chromatin accessibility assays such as ATAC-seq generate thousands of candidate regulatory elements (peaks), yet no standardized tool exists for assembling the diverse quantitative features needed to prioritize peaks for functional validation. Here we present PyPeakRankR, an open-source Python package that extracts peak-level features, namely BigWig signal summaries, GC content, PhyloP conservation scores, distribution moments (kurtosis, skewness, bimodality), and cell-type specificity rankings, into a single reproducible peak by feature matrix stored as a tab-separated values (TSV) file. PyPeakRankR separates deterministic feature extraction from downstream ranking, enabling transparent benchmarking of prioritization strategies on the same upstream data. The package provides both a command-line interface and a matching Python API, supports cross-assembly scoring via liftOver, and runs in minutes on thousands of peaks. PyPeakRankR was validated in the Brain Initiative Cell Census Network (BICCN) community challenge, where its predecessor PeakRankR ranked among the top 3 of 16 methods for cell-type specific enhancer prediction. In a recent basal ganglia study, PyPeakRankR was used within the Cross-species Enhancer Ranking Pipeline (CERP) to identify enhancer-AAV tools achieving greater than 70% on-target specificity across cell types. PyPeakRankR is freely available under the MIT license at https://github.com/AllenInstitute/PeakRankR/tree/python-package.

2606.17745 2026-06-17 q-bio.NC 新提交

Separating wiring-specific from statistical control of dynamics in a complete connectome

在完整连接组中分离接线特定与统计控制对动力学的影响

Stavros Therianos

AI总结 通过运行完整连接组作为固定速率模型,并与随机化网络比较,发现粗粒度接线统计决定整体动力学状态,而精确接线模式决定活动传播路径和回路几何结构。

Comments 20 pages, 6 figures. Supplementary Information provided as an ancillary file

详情
AI中文摘要

电子显微镜重建现已产生整个小型大脑的完整突触接线图,即连接组,包括第一个完全重建的昆虫大脑——果蝇幼虫。接线图单独在多大程度上固定电路的活动,相对于它未记录的更精细的生理细节,仍存在争议。我们将一个完整的连接组作为固定的、基于速率的动力学算子运行,其中没有单个神经元参数被拟合,因此在固定的动力学状态下,模型的行为反映接线及其连接强度,而非调谐的单神经元生理学,并将其与一系列随机化网络进行比较,每个随机化网络保留了接线更粗粒度的描述。模型的整体动力学状态,即其响应的强度和丰富程度,主要是统计性的:仅保留连接组粗粒度接线统计的网络能够重现它。超出这些统计的接线则设定活动传播的位置以及哪些回路塑造它。稀疏输入被限制在一个紧凑的嗅觉通路中,而随机化网络会淹没该通路;蘑菇体(昆虫学习中心)在主导伴随侧模式中占据过大作用,这些模式决定了哪些神经元塑造循环动力学。粗粒度统计设定状态;精确的连接模式设定几何结构,这种分离澄清了哪些基于连接组的论断仅依赖于接线。

英文摘要

Electron-microscopy reconstruction now yields complete synaptic wiring diagrams, or connectomes, of entire small brains, including the larval Drosophila, the first insect brain reconstructed in full. How far a wiring diagram alone fixes a circuit's activity, as opposed to the finer physiological detail it does not record, is debated. We run a complete connectome as a fixed, rate-based dynamical operator in which no single-neuron parameter is fitted, so that, at one fixed dynamical regime, the model's behavior reflects the wiring and its connection strengths rather than tuned single-neuron physiology, and compare it against a hierarchy of randomized networks that each preserve a coarser description of the wiring. The model's overall dynamical regime, how strongly and how richly it responds, is mostly statistical: networks keeping only the connectome's coarse wiring statistics reproduce it. The wiring beyond these statistics instead sets where activity travels and which circuits shape it. Sparse input is confined to a compact olfactory pathway that randomized networks flood, and the mushroom body, the insect learning center, takes an outsized role in the leading adjoint-side modes, the directions that weigh which neurons shape the recurrent dynamics. Coarse statistics set the regime; the precise pattern of connections sets the geometry, a separation that clarifies which connectome-based claims rest on wiring alone.

2606.17736 2026-06-17 q-bio.NC 新提交

Ten Years of the Stochastic Resonance Model of Tinnitus: From Phantom Perception to Adaptive Sensory Optimization

耳鸣的随机共振模型十年:从幻听感知到自适应感觉优化

Patrick Krauss, Achim Schilling

AI总结 本文综述了耳鸣的随机共振模型,该模型将耳鸣重新解释为听觉系统为补偿听力损失而自适应上调神经噪声的副产品,并总结了理论、实验和临床应用进展。

详情
AI中文摘要

主观性耳鸣——在没有外部声刺激的情况下感知声音——仍然是听觉神经科学中最具争议的现象之一。2016年,随机共振(SR)模型被引入作为耳鸣相关神经元过度活跃的替代解释,提出内部产生的神经噪声被自适应上调以恢复听力损失后的信息传递。该模型没有将增加的自发活动解释为适应不良,而是将其重新定义为一种功能机制,增强感觉阈值附近的信号检测,而耳鸣则作为自适应感觉优化的副作用出现。在过去十年中,这一框架已从现象学假设发展为更广泛的神经计算理论,将信息论、自适应信号检测、多通道听觉处理和跨模态可塑性联系起来。计算建模、大规模临床分析和动物实验为关键预测提供了汇聚支持,包括特定噪声条件下的可检测性改善和频率特异性幻听。该框架还启发了基于频谱匹配近阈值噪声刺激的治疗方法,并最近被整合到一个统一的听觉幻听解释中,该解释结合了随机共振、中枢增益、稳态可塑性和预测编码。本综述按时间顺序概述了随机共振模型的发展,总结了主要理论和实验进展,并指出了机制验证和临床转化的未来方向。通过将耳鸣重新定义为自适应感觉计算的结果,该模型将概念焦点从病理功能障碍转向神经系统中信息优化的原理。

英文摘要

Subjective tinnitus - the perception of sound in the absence of an external acoustic stimulus - remains one of the most debated phenomena in auditory neuroscience. In 2016, the stochastic resonance (SR) model was introduced as an alternative account of tinnitus-related neuronal hyperactivity, proposing that internally generated neural noise is adaptively upregulated to restore information transmission after hearing loss. Rather than interpreting increased spontaneous activity as maladaptive, the model reframed it as a functional mechanism that enhances signal detection near sensory thresholds, with tinnitus emerging as a side effect of adaptive sensory optimization. Over the past decade, this framework has evolved from a phenomenological hypothesis into a broader neurocomputational theory linking information theory, adaptive signal detection, multichannel auditory processing, and cross-modal plasticity. Computational modeling, large-scale clinical analyses, and animal experiments have provided converging support for key predictions, including improved detectability under specific noise conditions and frequency-specific phantom percepts. The framework has also inspired a therapeutic approach based on spectrally matched near-threshold noise stimulation and has recently been integrated into a unified account of auditory phantom perception that combines stochastic resonance, central gain, homeostatic plasticity, and predictive coding. This review provides a chronological overview of the development of the stochastic resonance model, summarizes major theoretical and empirical advances, and outlines future directions for mechanistic validation and clinical translation. By redefining tinnitus as a consequence of adaptive sensory computation, the model shifts the conceptual focus from pathological dysfunction toward principles of information optimization in neural systems.

2606.17457 2026-06-17 q-bio.SC 新提交

Aging induced structural alterations in SR-Mitochondria interaction in skeletal muscle: Emerging insights

衰老诱导的骨骼肌SR-线粒体相互作用结构改变:新见解

Unmod Senapati, Barsha Priyadarshini Kar, Sunil Pani, Naresh Chandra Bal

AI总结 本文综述了衰老过程中骨骼肌肌浆网与线粒体接触(MAMs)的结构和功能变化,探讨了运动、营养和药物干预对延缓MAMs丢失的作用。

详情
AI中文摘要

骨骼肌在衰老过程中经历显著变化,包括解剖、超微结构以及生化方面的改变。与衰老相关的肌肉质量减少,称为肌少症,是老年功能衰退和虚弱的主要因素,导致自信心下降。在成年骨骼肌纤维中,肌浆网(SR)和线粒体与肌膜(形成T-小管)一起表现出最复杂和精确的分布,这对肌肉功能至关重要。在健康的年轻肌肉组织中,SR和线粒体膜的紧密物理接近显示出称为线粒体相关膜(MAMs)的接触。最近的文献强调了MAMs网络通过调节Ca2+信号、脂质运输和其他信号分子(如活性氧)的定位,在肌肉顺畅运作中的作用。提出了几种锚定机制来稳定MAMs网络,经典的是线粒体融合蛋白(MFN1和MFN2)。新兴共识表明,骨骼肌中的MAMs促进了兴奋-代谢耦合的准确性,确保空间能量供应。然而,在衰老过程中,SR和线粒体的共定位以及串扰的精确性似乎受到影响。在这篇综述中,我们批判性地审视了关于健康和疾病中MAMs网络结构和功能的当前文献,主要从衰老的角度出发。我们进一步评估了运动、营养、营养保健品和药理学方法在减少MAMs丢失以延缓衰老进展中的作用。保持骨骼肌健康与功能是实现健康老龄化目标的主要因素。

英文摘要

Skeletal muscle undergoes remarkable changes during aging, including anatomical, ultrastructural, and biochemical alterations. The aging-associated reduction of muscle mass, termed sarcopenia, is a major factor in geriatric functional decline and frailty, contributing to the lowering of self-confidence. In adult skeletal muscle fibers, the sarcoplasmic reticulum (SR) and mitochondria exhibit a highly intricate and precise distribution along with the sarcolemma (forming T-tubules), which is critical for muscle function. In healthy young muscle tissue, the close physical proximity of SR and mitochondrial membranes gives rise to contacts called mitochondria-associated membranes (MAMs). Recent literature highlights the role of the MAMs network in the smooth functioning of muscle by regulating the localization of Ca2+ signaling, lipid transport, and other signaling molecules such as reactive oxygen species. Several tethering mechanisms have been proposed to stabilize the MAMs network, the classical ones being the mitofusins (MFN1 and MFN2). Emerging consensus suggests that MAMs in skeletal muscle facilitate the accuracy of excitation-metabolic coupling, ensuring spatial energy supply. However, upon aging, the precision of SR and mitochondria co-localization, as well as their crosstalk, seems to be affected. In this review, we critically examine the current literature on MAMs network structure and function in health and disease, mainly from an aging perspective. We further evaluate the role of exercise, nutritional, nutraceutical, and pharmacological approaches in reducing MAMs loss in an effort to retard aging progression. Retention of skeletal muscle health and performance is a major factor in achieving the goal of healthy aging.

2606.17277 2026-06-17 q-bio.OT 新提交

Accuracy, Repeatability, and Reproducibility of a Radiographic Technique to Assess Spinal Cord Stimulation Lead Position: A Validation Study

评估脊髓刺激电极位置的放射学技术的准确性、重复性和再现性:一项验证研究

Andrew Thoreson, Katrina Fernandez, Cesar Lopez, Margaux Linde, Mark A. Bendel, Peter Grahn, Kristin D. Zhao

AI总结 本研究开发了一种通过放射线片测量脊髓刺激电极位置的技术,并验证了其准确性、重复性和再现性,发现最小可检测变化小于相邻电极间距,且重复性和再现性引入的变异小于总研究变异的10%。

Comments 11 pages, 2 tables, 6 figures

详情
AI中文摘要

脊髓刺激通过植入电极是治疗多种慢性疼痛的有效疗法。然而,电极移位是导致疗效丧失的常见并发症。以往研究使用放射线片描述电极移位,但方法不一致且缺乏严格验证。本研究旨在开发一种测量腰骶椎管内硬膜外脊髓刺激电极位置的放射学技术,并确定其准确性、重复性和再现性。对三名经皮植入两个八触点圆柱形电极的临床试验参与者进行计算机断层扫描,通过三维测量确定电极位置,并生成数字重建放射线片。两名操作员对每个电极应用数字化和测量协议。创建Bland-Altman图以确定最小可检测变化,并进行量具重复性和再现性分析。发现最小可检测变化小于相邻电极间距,且重复性和再现性引入的变异小于总研究变异的10%。我们得出结论,所开发的测量电极位置的方法具有足够的准确性以及可接受的重复性和再现性。

英文摘要

Spinal cord stimulation with implantable leads is a valuable therapy used to treat a variety of chronic pain conditions. However, lead migration is a common complication causing loss of efficacy. Previous reports have characterized lead migration using radiographs, but methods are not consistent and lack rigorous validation. The purpose of this study was to develop a technique to perform radiographic measurements of the position of epidural spinal cord leads within the lumbosacral spinal canal and establish its accuracy, repeatability, and reproducibility. Computed tomography scans were acquired from three clinical trial participants implanted percutaneously with two eight-contact cylindrical leads; from these, electrode positions were established using three-dimensional measurements, and digitally reconstructed radiographs were created. Two operators applied a digitization and measurement protocol for each lead. Bland-Altman plots were created to determine smallest detectable change, and a gage repeatability and reproducibility analysis was performed. Smallest detectable change was found to be less than the distance between adjacent electrodes and variability introduced by repeatability and reproducibility was less than 10% of the total study variability. We conclude that the method developed to measure lead electrode position has sufficient accuracy and acceptable repeatability and reproducibility.
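
A Bland-Altman analysis of the kind mentioned above can be sketched in a few lines. This is an illustrative sketch, not the study's protocol: it assumes paired measurements of the same quantity from two operators (or two sessions), and it takes the smallest detectable change as 1.96 times the standard deviation of the paired differences, which is one common convention and may differ from the definition the authors used. The function name and sample values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(first, second, z=1.96):
    """Bias, limits of agreement, and a smallest-detectable-change estimate
    from paired measurements (for example, lead position in mm).

    Assumption: SDC = z * SD of the paired differences, which equals
    z * sqrt(2) * SEM when SEM is taken as SD_diff / sqrt(2).
    """
    if len(first) != len(second):
        raise ValueError("measurement lists must be paired")
    differences = [a - b for a, b in zip(first, second)]
    bias = mean(differences)
    spread = stdev(differences)  # sample standard deviation (n - 1)
    return {
        "bias": bias,
        "lower_limit": bias - z * spread,
        "upper_limit": bias + z * spread,
        "sdc": z * spread,
    }


# Hypothetical paired readings from two operators, in millimetres.
result = bland_altman([10.0, 12.0, 11.0, 13.0], [10.5, 11.5, 11.5, 12.5])
```

Comparing `result["sdc"]` with the spacing between adjacent electrodes mirrors the acceptance criterion described in the abstract: a change smaller than the SDC cannot be distinguished from measurement noise.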

2606.17327 2026-06-17 q-bio.BM cs.AR cs.ET cs.NE 新提交

Energy-efficient codon optimization on thermodynamic hardware

热力学硬件上的节能密码子优化

Andraz Jelincic, Ross C. Walker

AI总结 本文将mRNA密码子优化问题映射到伊辛模型,在热力学采样单元上实现,相比GPU能耗降低约10^6倍,为热力学计算在制药领域的应用提供了首个具体实例。

Comments Preprint available on bioRxiv: DOI TBD

详情
AI中文摘要

计算能耗的不断增长正变得日益不可持续。热力学计算利用物理热涨落作为计算资源而非抑制它们,为概率性和组合性任务提供了数量级的节能。制药研发严重依赖计算优化和采样,是一个自然的应用领域。本文提出了据我们所知首个映射到热力学硬件的具体制药应用,并给出了基于原型测量的能耗估计。我们将mRNA密码子优化(药物开发中常规解决的组合问题)简化为从伊辛模型采样,使其可直接在热力学采样单元(TSU)上执行。在SARS-CoV-2刺突蛋白上对三种方法(Potts采样、伊辛采样和遗传算法基线)进行基准测试,发现所有方法均达到相当的优化质量(得分约234-240),但基于验证硬件模型的能耗估计表明,TSU解决该问题所需的能量约为传统GPU的10^6分之一。所有代码均以开源许可证发布。

英文摘要

The growing energy demand for computation is becoming increasingly unsustainable. Thermodynamic computing, which harnesses physical thermal fluctuations as a computational resource rather than suppressing them, offers orders-of-magnitude energy savings for probabilistic and combinatorial tasks. Pharmaceutical R&D, heavily reliant on computational optimization and sampling, is a natural application domain. Here we present what is, to our knowledge, the first concrete pharmaceutical application mapped to thermodynamic hardware with energy estimates grounded in prototype measurements. We reduce mRNA codon optimization, a combinatorial problem routinely solved in drug development, to sampling from an Ising model, making it directly executable on a thermodynamic sampling unit (TSU). Benchmarking three approaches (Potts sampling, Ising sampling, and a genetic algorithm baseline) on the SARS-CoV-2 spike protein, we find that all achieve comparable optimization quality (scores ~234-240), but energy estimates based on validated hardware models indicate that a TSU could solve this problem using approximately 10^6 times less energy than a conventional GPU. All code is released under an open-source license.
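
To make the sampling view of codon optimization concrete, here is a minimal software sketch of Metropolis sampling over a Potts-style model, where each amino-acid position is a spin whose values are its synonymous codons. It is not the paper's formulation: the codon weights, the repeat penalty, and the temperature are made-up illustrative values, only four amino acids are covered, and the further reduction from Potts to binary Ising spins is not shown.

```python
import math
import random

# Synonymous codons from the standard genetic code (small subset only).
SYNONYMS = {
    "M": ["AUG"],
    "K": ["AAA", "AAG"],
    "F": ["UUU", "UUC"],
    "G": ["GGU", "GGC", "GGA", "GGG"],
}
# Hypothetical per-codon preference weights (NOT measured usage data).
WEIGHT = {"AUG": 1.0, "AAA": 0.4, "AAG": 0.6, "UUU": 0.45, "UUC": 0.55,
          "GGU": 0.16, "GGC": 0.34, "GGA": 0.25, "GGG": 0.25}
REPEAT_PENALTY = 0.5  # hypothetical coupling that discourages identical neighbours


def energy(codons):
    """Local field (negative log weight) plus nearest-neighbour coupling."""
    field = -sum(math.log(WEIGHT[c]) for c in codons)
    coupling = REPEAT_PENALTY * sum(a == b for a, b in zip(codons, codons[1:]))
    return field + coupling


def metropolis(protein, steps=2000, temperature=0.5, seed=0):
    """Return the lowest-energy codon sequence visited by a Metropolis chain."""
    rng = random.Random(seed)
    state = [rng.choice(SYNONYMS[aa]) for aa in protein]
    best = list(state)
    for _ in range(steps):
        position = rng.randrange(len(protein))
        proposal = list(state)
        proposal[position] = rng.choice(SYNONYMS[protein[position]])
        delta = energy(proposal) - energy(state)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            state = proposal
            if energy(state) < energy(best):
                best = list(state)
    return best


optimized = metropolis("MKFGGK")
```

Because every move swaps a codon for a synonym of the same amino acid, the encoded protein is preserved by construction; only the energy changes. On thermodynamic hardware the random accept/reject loop would be replaced by physical fluctuations, which is where the claimed energy savings come from.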

2606.17247 2026-06-17 eess.SP cs.ET 新提交

Large-scale Tunable Liquid Lens-assisted VLC Systems under Random Receiver Orientation

大规模可调谐液体透镜辅助VLC系统在随机接收器方向下的研究

Kapila W. S. Palitharathna, Constantinos Psomas, Gaofeng Pan, Ioannis Krikidis

AI总结 针对随机接收器方向下的大规模可见光通信系统,提出基于电润湿的可调谐液体透镜接收器架构,通过动态调整液面方向增强信号并抑制干扰,基于随机几何推导中断概率解析式,最佳信号接收策略相比传统固定透镜降低57.1%中断概率。

Comments This paper has been submitted to IEEE Transactions on Wireless Communications journal

详情
AI中文摘要

本文研究了在随机接收器方向下,可调谐液体透镜辅助接收器在大规模可见光通信系统中的性能。提出了一种简单的基于电润湿的TLL架构,能够通过调整液体界面方向动态地将入射光信号导向光电二极管接收器。该架构增强了期望信号接收,同时减轻了来自相邻接入点的干扰。AP的空间分布采用Matérn硬核点过程建模,而接收器方向由均匀分布的方位角和服从高斯分布的极角表征。此外,开发了一个易处理的光学信道数学模型,以捕捉AP/接收器位置、接收器方向和透镜调整角度对VLC信道增益的综合影响。基于此框架,提出了三种透镜方向策略:最佳信号接收、最近LED选择和垂直向上透镜方向,以改善动态接收器条件下的系统性能。利用随机几何工具,推导了每种方案的中断概率的精确和近似解析表达式。数值结果验证了所开发分析的准确性,并表明所提出的TLL辅助接收器架构在严重的接收器方向波动和密集AP部署下显著提高了VLC系统的鲁棒性。特别是,在AP高度为3.5 m、AP密度为0.2 m^{-2}时,BSR方案相比传统固定透镜接收器将中断概率降低了57.1%。所提出的分析框架和数值结果为未来TLL辅助VLC网络的部署提供了有用的设计见解。

英文摘要

This paper investigates the performance of tunable liquid lens (TLL)-assisted receivers in large-scale visible light communication (VLC) systems under random receiver orientation. A simple electrowetting-based TLL architecture is proposed, capable of dynamically steering the incident optical signal toward the photodiode receiver by adjusting the orientation of the liquid interface. The proposed architecture enhances the desired signal reception while mitigating interference from neighboring access points (APs). The spatial distribution of APs is modeled using a Matérn hard-core point process, whereas receiver orientation is characterized by uniformly distributed azimuth angles and Gaussian-distributed polar angles. Furthermore, a tractable mathematical optical channel model is developed to capture the combined effects of AP/receiver locations, receiver orientation, and lens adjustment angles on the VLC channel gain. Based on this framework, three lens orientation strategies, namely best signal reception (BSR), closest LED selection, and vertical upward lens orientation, are proposed to improve system performance under dynamic receiver conditions. Using stochastic geometry tools, exact and approximate analytical expressions for the outage probability are derived for each scheme. Numerical results verify the accuracy of the developed analysis and demonstrate that the proposed TLL-assisted receiver architecture significantly improves the robustness of VLC systems under severe receiver orientation fluctuations and dense AP deployments. In particular, the BSR scheme reduces the outage probability by $57.1\%$ compared with conventional fixed-lens receivers at an AP height of $3.5$ m and AP density of $0.2~\text{m}^{-2}$. The presented analytical framework and numerical results provide useful design insights for the deployment of future TLL-assisted VLC networks.

2606.18228 2026-06-17 cs.HC 新提交

MAJIC: Leveraging Articulatory Motion for Speech-based Emotion Recognition

MAJIC: 利用发音运动进行基于语音的情感识别

Tanmay Srivastava, Paras Bhavnani, Benjir Alvee Islam, Shubham Jain

AI总结 提出MAJIC多模态情感识别系统,通过融合下颌和面部肌肉的发音运动特征与音频特征,在多种语言和场景下实现93%准确率和91%F1分数,优于纯音频基线。

详情
AI中文摘要

我们介绍了MAJIC,一个多模态情感识别系统,它利用下颌和面部肌肉的发音运动进行基于语音的情感识别(SER)。虽然大多数SER系统在训练有素的演员强烈表达情感语音的数据集上表现良好,但当情感表达变得更加微妙时,它们的性能往往会下降。我们通过从发音运动中提取特征,并使用多任务学习框架将其与音频特征集成来探索这一挑战。我们的关键见解是,语音中的情感不仅通过声音特征表现出来,还通过不同的发音运动表现出来:下颌运动、面部肌肉振动和语音引起的振动。虽然音频捕获了音高和韵律等特征,但发音运动包含了音频中不存在的补充信息。我们在从20名参与者收集的数据上评估了我们的系统,这些数据跨越多个会话、10种语言以及包括提示语音和对话语音在内的多种场景,显示了其在用户和设置中的鲁棒性。MAJIC在情感分类上达到了93%的准确率和91%的F1分数,在我们的数据集上优于强大的基于音频的基线。

英文摘要

We introduce MAJIC, a multimodal emotion recognition system that leverages articulatory motion of the jaw and facial muscles for speech-based emotion recognition (SER). While most SER systems perform well on datasets with strongly expressed emotional speech of trained actors, their performance often degrades when emotional expressions become more subtle. We explore this challenge by engineering features from articulatory motion and integrating them with audio features using a multi-task learning framework. Our key insight is that emotion in speech manifests not only through vocal characteristics but also through distinct articulatory motions: jaw movements, facial muscle vibrations, and speech-induced vibrations. While audio captures features such as pitch and prosody, articulatory motion contains complementary information that is not present in audio alone. We evaluate our system on data collected from 20 participants across multiple sessions, 10 languages, and diverse scenarios, including prompted and conversational speech, showing its robustness across users and settings. MAJIC achieves 93% accuracy and 91% F1 score for emotion classification, outperforming strong audio-based baselines on our dataset.

2606.18225 2026-06-17 cs.DS cs.CC 新提交

Directed Reachability-Preserving Minimum Edge Cut: Approximation and Planar Hardness

保持可达性的有向最小边割:近似与平面难度

Qi Duan

AI总结 研究有向三终端保持可达性的最小边割问题,提出基于根线性近似的O(√r)近似算法,并证明有向平面版本是NP难的。

详情
AI中文摘要

我们研究了三终端保持可达性的最小边割问题的有向版本。给定一个有向图 $G=(V,A)$,其中弧具有成本,以及终端 $s_1,s_2,t$,单向有向 RPMEC 问题要求找到一个最小成本的弧集,删除该弧集后保持 $s_1\leadsto s_2$ 的可达性,同时破坏 $s_1\leadsto t$ 的可达性。我们首先给出了一个基于根有向割函数的路径-割公式。利用相关多拟阵的根线性近似,我们得到了一个 $O(\sqrt r)$-近似,其中 $r$ 是具有正单点割值的相关顶点数。特别地,这给出了在一般有向图中的 $O(\sqrt n)$-近似。对于有向无环图,我们给出了一个额外的单点长度算法,并得到了 $O(\min\{\sqrt r, h\})$ 的保证,其中 $h$ 是 $s_1$-$s_2$ 路径上相关顶点的最大数量。最后,我们证明了有向平面 RPMEC 是 NP 难的,即使在具有非负成本的有向无环平面图中也是如此,通过从三次平面图的独立集问题归约,使用有限双峰有向节点割构造和平面节点到边的分裂。

英文摘要

We study a directed version of the three-terminal reachability-preserving minimum edge cut problem. Given a directed graph $G=(V,A)$ with arc costs and terminals $s_1,s_2,t$, the one-way directed RPMEC problem asks for a minimum-cost set of arcs whose deletion preserves the reachability $s_1\leadsto s_2$ while destroying the reachability $s_1\leadsto t$. We first give a path--cut formulation in terms of a rooted directed cut function. Using a root-linear approximation for the associated polymatroid, we obtain an $O(\sqrt r)$-approximation, where $r$ is the number of relevant vertices with positive singleton cut value. In particular this gives an $O(\sqrt n)$-approximation in general directed graphs. For acyclic directed graphs, we give an additional singleton-length algorithm and obtain an $O(\min\{\sqrt r,h\})$ guarantee, where $h$ is the maximum number of relevant vertices on an $s_1$-$s_2$ path. Finally, we prove that directed planar RPMEC is NP-hard, even on acyclic planar digraphs with nonnegative costs, by reducing from independent set on cubic planar graphs through a finite-bimodal directed node-cut construction and a planar node-to-edge split.
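
The problem statement itself is easy to pin down with an exhaustive search on a tiny instance. The sketch below only illustrates the definition of one-way directed RPMEC; it is exponential in the number of arcs and has nothing to do with the approximation algorithms or the hardness reduction in the paper. Function names and the example graph are hypothetical.

```python
from itertools import combinations


def reachable(arcs, source, target):
    """Depth-first search: is there a directed path from source to target?"""
    adjacency = {}
    for tail, head in arcs:
        adjacency.setdefault(tail, []).append(head)
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def brute_force_rpmec(arc_costs, s1, s2, t):
    """Cheapest arc set whose deletion keeps s1 -> s2 reachable and makes
    t unreachable from s1. Returns (None, None) if no such set exists."""
    arcs = list(arc_costs)
    best_cost, best_cut = None, None
    for size in range(len(arcs) + 1):
        for cut in combinations(arcs, size):
            removed = set(cut)
            kept = [arc for arc in arcs if arc not in removed]
            if reachable(kept, s1, s2) and not reachable(kept, s1, t):
                cost = sum(arc_costs[arc] for arc in cut)
                if best_cost is None or cost < best_cost:
                    best_cost, best_cut = cost, removed
    return best_cost, best_cut


# Hypothetical instance: the only s1 -> s2 path runs through vertex "a".
costs = {("s1", "a"): 1, ("a", "s2"): 5, ("a", "t"): 2,
         ("s1", "t"): 4, ("s2", "t"): 3}
cost, cut = brute_force_rpmec(costs, "s1", "s2", "t")
```

In this instance the cheap arc `("s1", "a")` cannot be cut because it lies on the only path to `s2`, so all three arcs entering `t` must be removed. That tension between preserving one reachability and destroying another is what separates RPMEC from an ordinary minimum cut.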

2606.18220 2026-06-17 cs.CR cs.DC 新提交

Gatling: Rapid-Fire Consensus from Parallel Composition

Gatling: 来自并行组合的快速射击共识

Giulia Scaffino, Max Resnick, Joachim Neu

AI总结 提出Gatling协议,通过并行运行多个原子广播实例并交错提案调度,将提案间隔降至低于网络延迟,从而提升区块链共识性能。

详情
AI中文摘要

共识协议构成了区块链和其他复制状态机的核心,确保所有正确节点处理相同全序的输入交易日志。在无故障执行中,性能由良好情况下的交易延迟驱动——即交易被所有节点知晓到被共识协议确认之间的时间——这既取决于提案的频率,也取决于提案一旦被提出后确认的速度。虽然先前的工作已经建立了现代协议已经实现的确认延迟的严格下界,但提案间时间能否进一步降低到低于一个网络延迟的最先进水平仍是一个开放问题。我们引入了Gatling,一种原子广播协议,在轮换领导者调度下实现了任意小的提案间时间;特别是小于网络延迟。Gatling运行多个黑盒原子广播协议的并行实例,并交错它们的提案调度,以比最先进协议更快的速度生成提案。一个确定性的交织规则将这些实例的输出合并成一个全局日志。我们分析了由崩溃领导者引起的队头阻塞的影响,并推导出Gatling的最优并行实例数。我们进一步研究了Gatling对可预测有效性的影响,并提出了两种保留该属性的变体。最后,我们的实验证实,Gatling可以与现成的组件协议一起使用,而无需为最小延迟微调组件协议,即可实现低延迟。

英文摘要

Consensus protocols form the core of blockchains and other replicated state machines, ensuring that all correct nodes process the same totally ordered log of input transactions. In fault-free executions, performance is driven by the good-case transaction latency -- the time between a transaction becoming known to all nodes and its confirmation by the consensus protocol -- which depends on both how frequently proposals are made and, once made, how quickly they are confirmed. While prior work has established tight lower bounds on confirmation latency that modern protocols already achieve, it remains open whether the inter-proposal time can be further reduced below the state-of-the-art of one network delay. We introduce Gatling, an atomic broadcast protocol that achieves arbitrarily small inter-proposal times under rotating leader schedules; in particular, smaller than the network delay. Gatling runs multiple parallel instances of a black-box atomic broadcast protocol and staggers their proposal schedules to generate proposals in faster succession than state-of-the-art protocols. A deterministic interleaving rule merges the outputs of these instances into a single global log. We analyze the effects of head-of-line blocking caused by crashed leaders, and derive Gatling's optimal number of parallel instances. We further study the impact of Gatling on predictable validity and present two variants that retain this property. Finally, our experiments confirm that Gatling can be used with off-the-shelf component protocols to achieve low latency without fine-tuning the component protocol for minimum latency.
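
The two mechanisms named above, staggered proposal schedules and a deterministic interleaving rule, can be sketched as follows. This is an assumption-laden illustration, not Gatling's specification: it assumes each instance proposes once per network delay, offsets instance i by i/k of that delay, and merges the per-instance logs round-robin. The abstract does not state the actual interleaving rule, so round-robin is a stand-in, and all names are hypothetical.

```python
from fractions import Fraction


def staggered_schedule(num_instances, delay, rounds):
    """Proposal events (time, instance, round) when each instance proposes
    every `delay` time units and instance i is offset by i * delay / k."""
    step = Fraction(delay) / num_instances
    events = []
    for round_index in range(rounds):
        for instance in range(num_instances):
            time = round_index * Fraction(delay) + instance * step
            events.append((time, instance, round_index))
    return sorted(events)


def interleave(logs):
    """Round-robin merge of per-instance logs into one global log.

    The merge stops at the first missing entry, so one stalled instance
    holds back everything behind it (head-of-line blocking).
    """
    merged = []
    position = 0
    while True:
        for log in logs:
            if position >= len(log):
                return merged
            merged.append(log[position])
        position += 1


schedule = staggered_schedule(num_instances=4, delay=1, rounds=2)
global_log = interleave([["a0", "a1"], ["b0", "b1"], ["c0"]])
```

With four instances the gap between consecutive proposals is a quarter of the network delay, even though no single instance proposes faster than before. The truncated third log in the usage example shows why crashed leaders matter: entries `a1` and `b1` are merged, but nothing further can be output until instance `c` produces its next entry.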

2606.18215 2026-06-17 cs.DB 新提交

A benchmark suite of intracellular Boolean model variants and multiscale simulations for computational biology

计算生物学细胞内布尔模型变体与多尺度模拟基准套件

Marco Masera, Riccardo Smeriglio, Roberta Bardini, Alessandro Savino, Stefano Di Carlo

AI总结 提出PhysiBench,包含612个可执行布尔调控网络变体和12万条多尺度随机模拟数据,支持系统生物学方法开发与评估。

Comments 20 pages, 4 figures, submitted as a Data Descriptor paper to Nature Scientific Data

详情
AI中文摘要

我们提出PhysiBench,这是一个用于开发和评估系统生物学计算方法的开放资源,包括612个可执行的细胞内布尔调控网络变体基准套件和12万条时间分辨的多尺度随机模拟数据集。基准模型源自七个已发表的布尔网络,涵盖细胞周期控制、发育模式、癌症信号、免疫反应和细胞命运决定,并可在PhysiBoSS/PhysiCell多尺度模拟框架中执行。模型变体通过基于突变的模型构建、在线行为过滤和离线敏感性评估生成。模拟数据集来自60个选定模型,在系统采样的刺激协议和固定模型级初始配置下产生。每条轨迹链接到其模型标识符、输入参数文件、随机种子和细胞级输出文件。PhysiBench支持直接模拟、替代建模、数据驱动推理、基于模拟的优化和比较基准测试。技术验证包括文件完整性和可执行性检查、基于图的结构多样性分析以及多尺度模拟输出的行为异质性评估。

英文摘要

We present PhysiBench, an open resource for developing and evaluating computational methods in systems biology including a benchmark suite of 612 executable intracellular Boolean regulatory network variants and a dataset of 120,000 time-resolved multiscale stochastic simulations. The benchmark models are derived from seven published Boolean networks spanning cell-cycle control, developmental patterning, cancer signaling, immune response, and cell-fate decisions, and are executable in the PhysiBoSS/PhysiCell multiscale simulation framework. Model variants are generated through mutation-based model construction, online behavioral filtering, and offline sensitivity evaluation. The simulation dataset is produced from 60 selected models under systematically sampled stimulation protocols and fixed model-level initial configurations. Each trajectory is linked to its model identifier, input-parameter file, stochastic seed, and cell-level output file. PhysiBench supports direct simulation, surrogate modeling, data-driven inference, simulation-based optimization, and comparative benchmarking. Technical validation includes file-integrity and executability checks, graph-based structural diversity analyses, and behavioral heterogeneity assessment from multiscale simulation outputs.

2606.18210 2026-06-17 cs.SI 新提交

Structural and Temporal Hallmarks of Genealogical Networks

家系网络的结构与时间标志

Japheth Carlson, Teayoun Kim, Matthew Lawyer, Wyatt Pochman, Emeline Thygerson, Benjamin Webb

AI总结 通过结合网络理论与推断时间的方法,分析上百个家系数据集,发现人类亲缘网络具有无标度度分布、小世界等普适结构特征,并引入伪世代提取时间结构。

详情
AI中文摘要

家系领域的快速增长,涵盖拥有数十亿记录和数百万用户的平台,产生了可供分析的最大、最复杂的网络之一。尽管家系网络研究取得了重大进展,但人类亲缘网络是否表现出普遍的结构特性仍不清楚。我们通过开发一种结合网络理论结构与推断时间概念的综合方法来解决这一问题。利用Kinsources库中的一百多个数据集,我们用家系术语重新解释标准网络度量,并引入\emph{伪世代},一种直接从网络拓扑中提取时间结构的方法。在此框架内,我们识别出跨数据集共享的共同特征。我们发现家系网络表现出无标度式的度和组件大小分布、多尺度家庭组织,以及基于遗传和婚姻距离的小世界行为。我们展示了2-组件提供了家系结构的自然单位,观察到一致的非同配混合,并发现记录的婚姻相对于潜在配对强烈偏向于短遗传距离。我们还记录了时间和人口统计模式,包括记录的亲代和子代信息的变化,以及记录婚姻、亲代和子代之间的相关性。这些结果表明,多样化的家系数据集共享一组共同的结构和时间特征,为人类亲缘网络的普遍特征提供了证据,并为其比较分析建立了通用框架。

英文摘要

The rapid growth of the genealogical sector, spanning platforms with billions of records and millions of users, has produced some of the largest and most complex networks available for analysis. Despite substantial advances in genealogical network research, it remains unclear whether human kinship networks exhibit universal structural properties. We address this by developing an integrated approach to genealogical network analysis that combines network-theoretic structure with an inferred notion of time. Using over one hundred datasets from the Kinsources repository, we reinterpret standard network measures in genealogical terms and introduce \emph{pseudogenerations}, a method for extracting temporal structure directly from network topology. Within this framework, we identify common features shared across datasets. We find that genealogical networks exhibit scale-free--like degree and component-size distributions, multiscale family organization, and small-world behavior with respect to genetic and union-based distances. We show that 2-components provide a natural unit of genealogical structure, observe consistent disassortative mixing, and find that recorded unions are strongly biased toward short genetic distances relative to potential pairings. We also document temporal and demographic patterns, including shifts in recorded parental and child information, as well as correlations among recorded unions, parents, and children. These results suggest that diverse genealogical datasets share a common set of structural and temporal characteristics, providing evidence for universal features of human kinship networks and establishing a general framework for their comparative analysis.