An Irregular Memory Access Prefetching Framework Based on Enhanced Reinforcement Learning

Yang Yuze, Huang Zhengwei, Wang Yongwen

Integrated Circuits and Embedded Systems, DOI: 10.20193/j.ices2097-4191.2026.0006


Abstract

Irregular data access patterns in high-performance computing and intelligent computing often render traditional data prefetching techniques ineffective, and existing models that rely on fixed rules or on offline learning tied to specific program contexts struggle to adapt to memory access patterns that change dynamically at runtime. While the Pythia reinforcement learning (RL) prefetching framework demonstrates adaptability through online learning, it still requires targeted tuning under extremely irregular workloads, which limits its generalization in practical applications. This paper proposes IEP (Irregular Enhanced Pythia), a context-aware reinforcement learning prefetching framework designed to enhance prediction of irregular memory access patterns. The framework introduces two key innovations: first, an irregular feature enhancement module that incorporates address bit masks and access-order distance as state features to capture the hidden spatiotemporal regularities of memory allocator behavior, thereby improving the representation of irregular memory accesses; second, a hierarchical reward strategy module that employs a confidence-aware, bandwidth-sensitive dynamic reward mechanism to guide the agent's learning process more precisely, accelerating policy optimization and improving final performance. Experiments were conducted on the ChampSim simulator across a variety of irregular workloads. The results show that, compared with the Pythia framework, the proposed scheme improves average prefetch accuracy by up to 2.27% and average single-core IPC by up to 2.90% on typical irregular workloads such as Ligra and PARSEC, while maintaining a stable performance advantage in multi-core environments.


Key words

irregular memory access / hardware prefetching / reinforcement learning / ChampSim / state features / dynamic reward mechanism

Cite this article

Yang Yuze, Huang Zhengwei, Wang Yongwen. An Irregular Memory Access Prefetching Framework Based on Enhanced Reinforcement Learning[J]. Integrated Circuits and Embedded Systems. https://doi.org/10.20193/j.ices2097-4191.2026.0006

Funding

Independent research project of the High-Level Science and Technology Innovation Talent Program (22-TDRCJH-02-006)
