Patents List

Title | Filing Date | Status | Number / URL | Main focus
Coprocessor prefetcher | 2024-07-25 (latest continuation); original 2021-12-10 | Grant / Application | US11755333B2, https://patents.google.com/patent/US11755333B2/en; US12050918B2, https://patents.google.com/patent/US12050918B2/en; US20250094174A1, https://patents.google.com/patent/US20250094174A1/en | Coprocessor operand-address capture and prefetch; not a CPU DMP, but part of Apple's recent prefetcher / indirect operand fetch portfolio
Deny list for a memory prefetcher circuit | 2023-05-24 | Grant | US12306762B1, https://patents.google.com/patent/US12306762B1/en | Useless-prefetch filtering, deny list, pollution control
Multi-table signature prefetch | 2021-07-21 | Grant | US11630670B2, https://patents.google.com/patent/US11630670B2/en | Instruction/control-flow signature prefetch; adjacent to the signature mechanisms used in data prefetching
Multi-block cache fetch techniques | 2021-05-19 | Grant | US12248399B2, https://patents.google.com/patent/US12248399B2/en | Aggregated multi-block fetch after a cache miss; close to an adjacent/sector prefetch fill path
Low latency fetch circuitry for compute kernels | 2020-10-08 (continuation); original 2018-09-26 | Grant | US10838725B2, https://patents.google.com/patent/US10838725B2/en; US11256510B2, https://patents.google.com/patent/US11256510B2/en | Early resolution of indirect-data-access items in the compute command stream; IMA-like fetch in the GPU/compute front end, not a CPU cache DMP
Secondary prefetch circuit that reports coverage to a primary prefetch circuit to limit prefetching by primary prefetch circuit | 2020-03-27 | Grant | US11176045B2, https://patents.google.com/patent/US11176045B2/en | Primary/secondary prefetcher coordination, coverage hint, large-stride/SMS
Sequential prefetch boost | 2017-04-26 | Grant | US10346309B1, https://patents.google.com/patent/US10346309B1/en | Sequential-stream prefetch boost / prefetch depth adjustment
Prefetch circuit with global quality factor to reduce aggressiveness in low power modes | 2017-02-17 | Grant | US10331567B1, https://patents.google.com/patent/US10331567B1/en | Global quality factor, low-power throttling, large-stride/SMS adjunct
Unified prefetch circuit for multi-level caches | 2016-04-07 | Grant | US10180905B1, https://patents.google.com/patent/US10180905B1/en; continuation US10621100B1, https://patents.google.com/patent/US10621100B1/en | AMPM/access-map prefetch, multi-level cache target, per-cache quality factor
Prefetch throttling in a multi-core system | 2016-04-07 | Grant | US9904624B1, https://patents.google.com/patent/US9904624B1/en | Shared-cache/request-queue pressure, low-confidence prefetch throttling
Access map-pattern match based prefetch unit for a processor | 2013-07-16 | Grant | US9015422B2, https://patents.google.com/patent/US9015422B2/en | AMPM/access-map pattern match, wildcard, quality factor
Pointer chasing prediction | 2013-05-09 | Grant | US9116817B2, https://patents.google.com/patent/US9116817B2/en | Early scheduling of dependent loads with LSU-internal forwarding; supporting mechanism for IMA/pointer chasing
Prefetching across page boundaries in hierarchically cached processors | 2012-11-29 | Grant | US9047198B2, https://patents.google.com/patent/US9047198B2/en | Upper-level prefetcher obtains the next-page translation early so the lower-level stream prefetcher does not stall at page boundaries
Converting memory accesses near barriers into prefetches | 2012-07-17 | Grant | US8856447B2, https://patents.google.com/patent/US8856447B2/en | Loads/stores blocked behind a memory barrier are converted into prefetch requests; latency hiding for ARM DMB/DSB scenarios
Coordinated prefetching in hierarchically cached processors | 2012-03-20; EP filed 2013-03-18 | Grant | US9098418B2, https://patents.google.com/patent/US9098418B2/en; EP2642398B1 / EP2642398A1, https://patents.google.com/patent/EP2642398A1/en | Coordination across the multi-level cache hierarchy, stream training, prefetch target selection
Prefetch unit | 2009-01-07 | Grant | US7779208B2, https://patents.google.com/patent/US7779208B2/en; continuation US7996624B2, https://patents.google.com/patent/US7996624B2/en | Multiple active streams, software/hardware-initiated data cache prefetch

Stream Prefetching

  1. Sequential prefetch boost (Apple, US10346309B1, filed 2017-04-26)

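    • Background: sequential stream prefetchers sometimes train conservatively and cannot ramp the prefetch distance quickly enough to hide memory latency.
    • Core design: once a stable sequential access pattern / stream is detected, boost prefetch behavior, for example by increasing the prefetch depth or advancing the prefetch position more aggressively.
    • Relevance: a conventional stream-prefetch enhancement, but it can serve as a low-complexity, high-coverage component of Apple's data prefetch subsystem.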
  2. Prefetch unit (Apple / originally P.A. Semi lineage, US7779208B2, filed 2009-01-07; continuation US7996624B2, filed 2010-07-06)

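    • Background: early Apple-assignee processor prefetch patents still center on data cache stream prefetch.
    • Core design: the prefetch unit is coupled to the data cache and maintains multiple active prefetch streams at once; a stream can be started by a software prefetch instruction or by hardware on a load/store miss.
    • Relevance: represents the early Apple/P.A. Semi data-cache stream prefetch baseline; the later AMPM/access-map and multi-level control work can be seen as a systematic extension of it.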
  3. Prefetching across page boundaries in hierarchically cached processors (Apple, US9047198B2, filed 2012-11-29)

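    • Background: lower-level prefetch units usually run ahead of upper-level prefetch units and stall at a virtual page boundary when no next-page translation is available.
    • Core design: as it approaches a page boundary, the upper-level prefetch unit requests the next-page translation in advance and passes it to the lower-level prefetch units before they reach the end of the current page, so they can jump to the next physical page and keep prefetching.
    • Relevance: not DMP/IMA, but it fills in Apple's multi-level stream prefetch design for page boundaries, TLB/translation, and L2/L3 prefetch continuity.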
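
The depth boost in item 1 (Sequential prefetch boost) can be sketched as a toy model. This is a minimal illustration, not the patented circuit: the class name `SequentialStream`, the depths (2 and 8 blocks), and the boost trigger (4 consecutive sequential accesses) are all assumptions chosen for readability.

```python
class SequentialStream:
    """Toy model of one sequential stream with a boostable prefetch depth."""

    def __init__(self, base_depth=2, boost_depth=8, boost_after=4):
        self.base_depth = base_depth      # hypothetical conservative depth
        self.boost_depth = boost_depth    # hypothetical boosted depth
        self.boost_after = boost_after    # sequential accesses needed to boost
        self.last_block = None
        self.run_length = 0
        self.prefetched_up_to = None      # highest block already requested

    def access(self, block):
        """Record a demand access and return the blocks to prefetch."""
        if self.last_block is not None and block == self.last_block + 1:
            self.run_length += 1
        else:
            # New or broken stream: restart training from this block.
            self.run_length = 0
            self.prefetched_up_to = block
        self.last_block = block

        boosted = self.run_length >= self.boost_after
        depth = self.boost_depth if boosted else self.base_depth
        target = block + depth
        start = max(self.prefetched_up_to, block) + 1
        requests = list(range(start, target + 1))
        if requests:
            self.prefetched_up_to = target
        return requests


stream = SequentialStream()
issued = [stream.access(b) for b in range(100, 105)]
```

With these hypothetical settings the stream requests one new block per access while training, then issues a 7-block burst on the fifth sequential access as the depth jumps from 2 to 8.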

Spatial Prefetcher

  1. Access map-pattern match based prefetch unit for a processor (Apple, US9015422B2, filed 2013-07-16)
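    • Background: out-of-order execution adds noise to the cache access map, and fixed stride/stream schemes cannot express many irregular-but-repeatable patterns.
    • Core design: an access map memory records the access state of each cache block within a region, and an access pattern memory is matched against it; patterns may contain wildcards, and a quality factor controls the prefetch rate.
    • Relevance: the foundational patent for Apple's AMPM-style data prefetcher, closely related to region/spatial-footprint prefetchers.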
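
The access map, wildcard pattern, and quality factor described above can be sketched as follows. This is a simplified illustration under stated assumptions: the state symbols, the two example patterns, and the use of a plain credit counter as the quality factor are all hypothetical, not taken from the patent text.

```python
ACCESSED, PREFETCHED, INVALID, WILDCARD = "A", "P", ".", "*"


def matches(access_map, pos, pattern, anchor):
    """True if `pattern`, aligned so pattern[anchor] sits at `pos`, fits the map."""
    for i, want in enumerate(pattern):
        if want == WILDCARD:
            continue  # wildcard: any state (or no block at all) is acceptable
        idx = pos + i - anchor
        if not 0 <= idx < len(access_map) or access_map[idx] != want:
            return False
    return True


class AccessMapPrefetcher:
    def __init__(self, blocks_per_region=16, quality=4):
        self.map = [INVALID] * blocks_per_region
        self.quality = quality  # hypothetical credit counter
        # (pattern, index of the current access, prefetch offsets)
        self.patterns = [
            ("A*A", 2, [2]),  # stride 2, middle block is a don't-care
            ("AA", 1, [1]),   # unit stride
        ]

    def access(self, block):
        """Record a demand access to `block`; return blocks prefetched."""
        if self.map[block] == PREFETCHED:
            self.quality += 1  # a useful prefetch earns a credit back
        self.map[block] = ACCESSED
        issued = []
        for pattern, anchor, offsets in self.patterns:
            if not matches(self.map, block, pattern, anchor):
                continue
            for off in offsets:
                tgt = block + off
                in_range = 0 <= tgt < len(self.map)
                if in_range and self.map[tgt] == INVALID and self.quality > 0:
                    self.map[tgt] = PREFETCHED
                    self.quality -= 1  # each prefetch spends a credit
                    issued.append(tgt)
            break  # first matching pattern wins
        return issued


pf = AccessMapPrefetcher()
trace = [pf.access(b) for b in (0, 2, 4)]
```

Because the middle symbol of `"A*A"` is a wildcard, the stride-2 pattern keeps matching whether or not the block in between was touched by a reordered access.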

IMA/DMP

Title | Filing Date | Number / URL | Notes
Content-directed prefetch circuit with quality filtering | 2016-08-25 | US9886385B1, https://patents.google.com/patent/US9886385B1/en |
Prefetch circuit for a processor with pointer optimization | 2015-06-24 | US9971694B1, https://patents.google.com/patent/US9971694B1/en; continuation US10402334B1, https://patents.google.com/patent/US10402334B1/en | Pointer-aware load/store prefetch

  1. CDP: Content-directed prefetch circuit with quality filtering (Apple, US9886385B1, filed 2016-08-25)

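    • Background: pointer chasing and linked data structures are poorly covered by simple stride/stream prefetching, and identifying pointer candidates from cache-line contents is prone to false positives.
    • Core design: search the loaded cache line for memory pointer candidates and filter them through a quality factor table; the table can be indexed by context such as the PC and the relative cache-line offset, and a prefetch is allowed only once the threshold is reached.
    • Relevance: among the public patents, this is the text closest to the core of Apple's CPU DMP: it explicitly scans the contents of data cache line fills, identifies memory pointer candidates, and uses quality feedback to decide whether to prefetch the cache line a candidate points to.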
  2. AMPM-Pointer: Prefetch circuit for a processor with pointer optimization (Apple, US9971694B1, filed 2015-06-24; US10402334B1, filed 2018-04-09)

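    • Background: in pointer-heavy code, load/store sequences may contain target addresses that can be identified early, but ordinary AMPM/stride mechanisms do not cover them well.
    • Core design: add a pointer optimization to the processor prefetch circuit, controlling pointer-aware prefetch around load/store accesses, cache state, and prefetch requests.
    • Relevance: together with Content-directed prefetch circuit with quality filtering, it forms Apple's patent portfolio for pointer and pointer-like data accesses.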
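
The content-directed flow in item 1 (scan a filled line for pointer candidates, then gate them through a quality factor table indexed by PC and word offset) can be sketched as below. The candidate test used here, comparing the upper 32 bits of a word with the upper bits of the line's own address, is an assumption borrowed from classic content-directed prefetching, and `THRESHOLD`, the 2-bit counter range, and all function names are hypothetical.

```python
import struct

LINE_BYTES = 64
THRESHOLD = 2   # hypothetical quality threshold
COUNTER_MAX = 3  # hypothetical saturating-counter limit


def pointer_candidates(line_addr, line_data, match_bits=32):
    """Yield (word_offset, value) for 8-byte words that look like pointers.

    Assumption: a word looks like a pointer when its upper `match_bits`
    equal the upper bits of the address of the line it was loaded from.
    """
    shift = 64 - match_bits
    for off in range(0, LINE_BYTES, 8):
        (value,) = struct.unpack_from("<Q", line_data, off)
        if value != 0 and (value >> shift) == (line_addr >> shift):
            yield off // 8, value


class QualityFilter:
    def __init__(self):
        self.table = {}  # (pc, word_offset) -> saturating counter

    def allow(self, pc, word_offset):
        return self.table.get((pc, word_offset), 0) >= THRESHOLD

    def feedback(self, pc, word_offset, useful):
        """Raise the counter when a candidate turned out useful, else lower it."""
        key = (pc, word_offset)
        cur = self.table.get(key, 0)
        self.table[key] = min(cur + 1, COUNTER_MAX) if useful else max(cur - 1, 0)


def on_line_fill(pc, line_addr, line_data, qf):
    """Return line-aligned prefetch addresses that pass the quality filter."""
    return [value & ~(LINE_BYTES - 1)
            for off, value in pointer_candidates(line_addr, line_data)
            if qf.allow(pc, off)]


line_addr = 0x7F1234567000
data = bytearray(LINE_BYTES)
struct.pack_into("<Q", data, 8, 0x7F12345A0010)   # pointer-like word
struct.pack_into("<Q", data, 16, 42)              # plain integer
qf = QualityFilter()
before = on_line_fill(0x400, line_addr, bytes(data), qf)
qf.feedback(0x400, 1, True)
qf.feedback(0x400, 1, True)
after = on_line_fill(0x400, line_addr, bytes(data), qf)
```

In this sketch nothing is prefetched until the (PC, offset) entry has been confirmed useful twice, which is the filtering role the quality factor table plays in the description above.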

Pointer Chasing Prediction

Title | Filing Date | Number / URL | Notes
Reducing latency for pointer chasing loads | 2014-04-29 | US9710268B2, https://patents.google.com/patent/US9710268B2/en | Load-to-load/store address-generation bypass for pointer chasing; not a prefetcher, but it illustrates Apple's hardware optimization of IMA/pointer-chasing latency

  1. Reducing latency for pointer chasing loads (Apple, US9710268B2, filed 2014-04-29)

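    • Background: in pointer chasing, the result of a producer load is the address input of a younger dependent load/store; forwarding it the conventional way through the register file / reservation station adds load-to-load latency.
    • Core design: when the producer load is predicted not to hit in the store queue, the dependent load/store can issue early; the producer load's result is bypassed directly from the data cache to the address generation unit to form the dependent load/store address.
    • Relevance: not a prefetcher, but it targets pointer-chasing latency directly and is an Apple load-use bypass optimization worth keeping alongside any IMA/DMP discussion.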
  2. Pointer chasing prediction (Apple, US9116817B2, filed 2013-05-09)

    • Background: pointer chasing such as linked-list traversal forms an older producing load -> younger consuming load dependency chain.
    • Core design: when the scheduler predicts that the producing load's result is suitable for LSU-internal forwarding and a younger load depends on that result, it issues the younger load early; the LSU forwards the producing load's result to the address generation logic.
    • Relevance: likewise not a prefetcher but dependent-load scheduling; like Apple's DMP/IMA work, it targets pointer-chasing / indirect-address latency.
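
Both items above shorten the per-hop cost of a dependent-load chain, so the saving scales with the chain length. A back-of-the-envelope model makes that explicit; the cycle counts below are hypothetical placeholders, not figures from either patent.

```python
def chain_cycles(hops, load_latency, forward_penalty):
    """Cycles to walk `hops` dependent loads back to back.

    Each hop pays the cache load latency plus the cost of getting the
    loaded value to the next load's address generation.
    """
    return hops * (load_latency + forward_penalty)


# Hypothetical numbers, chosen only to show the shape of the saving.
baseline = chain_cycles(hops=1000, load_latency=3, forward_penalty=1)
bypassed = chain_cycles(hops=1000, load_latency=3, forward_penalty=0)
saved = baseline - bypassed
```

Under these assumed values, removing a one-cycle forwarding penalty saves one cycle per hop, so the total saving grows linearly with the length of the pointer chain.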

Instruction Prefetching

  1. Multi-table signature prefetch (Apple, US11630670B2, filed 2021-07-21)
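    • Background: a single signature generation technique behaves inconsistently across control-flow paths in terms of history length and aliasing.
    • Core design: use multiple signature generation techniques and multiple signature prefetch tables, updated on training events and used for future instruction/control-flow prefetches.
    • Relevance: it leans toward instruction/control-flow prefetch and should not be compared directly with data prefetchers, but the multi-table signature, history hashing, and confidence management ideas carry over to data delta/signature prefetchers.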
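
A minimal sketch of the multi-table idea follows, assuming that the signature generation techniques differ only in history length (2 and 4 recent fetch addresses) and that a training event is an instruction-cache miss. The shift-and-XOR hash `fold` and all class and method names are hypothetical.

```python
from collections import deque


def fold(addresses, bits=16):
    """Hypothetical signature hash: shift-and-XOR addresses into `bits` bits."""
    sig = 0
    for a in addresses:
        sig = ((sig << 5) ^ a) & ((1 << bits) - 1)
    return sig


class MultiTableSignaturePrefetcher:
    def __init__(self, history_lengths=(2, 4)):
        self.lengths = history_lengths
        self.history = deque(maxlen=max(history_lengths))
        self.tables = [{} for _ in history_lengths]  # one table per technique

    def _signatures(self):
        hist = list(self.history)
        return [fold(hist[-n:]) if len(hist) >= n else None
                for n in self.lengths]

    def observe(self, fetch_addr):
        """Append a fetch address (e.g. a taken-branch target) to the history."""
        self.history.append(fetch_addr)

    def train(self, miss_addr):
        """Training event: each table learns miss_addr under its own signature."""
        for table, sig in zip(self.tables, self._signatures()):
            if sig is not None:
                table[sig] = miss_addr

    def predict(self):
        """Prefetch candidates for the current history, longest history first."""
        out = []
        pairs = list(zip(self.tables, self._signatures()))
        for table, sig in reversed(pairs):
            addr = table.get(sig)
            if addr is not None and addr not in out:
                out.append(addr)
        return out


pf = MultiTableSignaturePrefetcher()
pf.observe(0x100)
pf.observe(0x200)
pf.train(0x900)          # miss at 0x900 followed the path 0x100 -> 0x200
pf.observe(0x300)
no_match = pf.predict()  # path now ends in 0x200 -> 0x300
pf.observe(0x100)
pf.observe(0x200)
match = pf.predict()     # short-history table recognises 0x100 -> 0x200 again
```

Here only the short-history table had enough context to train, so it alone supplies the prediction when the path recurs; the long-history table would start contributing once four addresses of history precede a training event.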

Prefetch Throttling

  1. Deny list for a memory prefetcher circuit (Apple, US12306762B1, filed 2023-05-24)

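    • Background: an aggressive prefetcher brings cache lines into the cache that are never used, causing pollution and wasting bandwidth and power.
    • Core design: on cache eviction, read the prefetch indicator; if a prefetched line is evicted untouched, add its address or address group to the prefetch deny list. Later prefetch requests that hit an active deny-list entry are rejected.
    • Relevance: a very direct industrial useless-prefetch filter that can serve as a lightweight negative-feedback throttling mechanism alongside FDP/PPF/NST.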
  2. Prefetch throttling in a multi-core system (Apple, US9904624B1, filed 2016-04-07)

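    • Background: when multiple cores share a lower-level cache / memory subsystem, one core's prefetches can block another core's demand fetches.
    • Core design: the shared/external cache tracks, per processor, the request-queue occupancy of demand fetches and low-confidence prefetches; based on thresholds and historical samples it sends throttle control to the cores, progressively limiting low-confidence prefetches.
    • Relevance: shares the goal of multi-core prefetch throttling schemes such as HPAC/SPAC/NST, but puts more weight on shared-cache queue pressure and per-core feedback.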
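
The deny-list mechanism in item 1 can be sketched as follows. The grouping of 4 adjacent lines per deny-list entry, the FIFO replacement that stands in for entry expiry, and every name in the code are assumptions made for illustration.

```python
class DenyListPrefetchFilter:
    def __init__(self, group_bits=2, capacity=4):
        self.group_bits = group_bits  # hypothetical: 2**group_bits lines per group
        self.capacity = capacity      # hypothetical deny-list size
        self.deny = []                # FIFO of denied line groups
        self.lines = {}               # line -> {"prefetched": ..., "touched": ...}

    def _group(self, line):
        return line >> self.group_bits

    def request_prefetch(self, line):
        """Return True if the prefetch is allowed and the line is filled."""
        if self._group(line) in self.deny:
            return False
        self.lines[line] = {"prefetched": True, "touched": False}
        return True

    def demand_access(self, line):
        """A demand access marks the line as used."""
        entry = self.lines.setdefault(
            line, {"prefetched": False, "touched": False})
        entry["touched"] = True

    def evict(self, line):
        """On eviction, deny the group of any prefetched-but-untouched line."""
        entry = self.lines.pop(line, None)
        if entry and entry["prefetched"] and not entry["touched"]:
            group = self._group(line)
            if group not in self.deny:
                self.deny.append(group)
                if len(self.deny) > self.capacity:
                    self.deny.pop(0)  # oldest entry ages out


f = DenyListPrefetchFilter()
f.request_prefetch(8)
f.evict(8)                               # evicted untouched -> group denied
denied = not f.request_prefetch(9)       # line 9 shares the group with line 8
f.request_prefetch(16)
f.demand_access(16)
f.evict(16)                              # was used -> group stays allowed
still_allowed = f.request_prefetch(17)
```

Tracking groups rather than single lines lets one wasted prefetch suppress its neighbors too, at the cost of occasionally blocking a useful line in the same group.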

Multi-Prefetcher Management

  1. Secondary prefetch circuit that reports coverage to a primary prefetch circuit to limit prefetching by primary prefetch circuit (Apple, US11176045B2, filed 2020-03-27)

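    • Background: when multiple data prefetchers use different mechanisms, they may cover the same data stream simultaneously, causing over-prefetching, low accuracy, and extra power.
    • Core design: the primary prefetch circuit generates prefetches for demand accesses and invokes the secondary prefetch circuit; once the secondary circuit reaches a threshold confidence it returns a coverage hint, and the primary circuit reduces the number of prefetches for the corresponding demand access / access map accordingly.
    • Relevance: very close to the multi-prefetcher coordination problem; the patent explicitly names a large stride prefetch circuit and a Spatial Memory Streaming (SMS) prefetch circuit as possible secondary prefetchers.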
  2. Unified prefetch circuit for multi-level caches (Apple, US10180905B1, filed 2016-04-07; continuation US10621100B1, filed 2018-12-05)

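    • Background: the best fill location for a given access pattern differs between levels such as the L1/data cache and L2/LLC, and separate prefetchers make it hard to control accuracy and target level in a unified way.
    • Core design: a prefetch circuit based on access map-pattern matching observes load/store demand accesses in one place, generates prefetches aimed at different cache levels according to the pattern, and controls prefetching at each level with a per-cache quality factor.
    • Relevance: essentially the multi-level version of Apple's AMPM-style data prefetcher; a useful reference for L1/L2/LLC target selection and per-level feedback.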
  3. Coordinated prefetching in hierarchically cached processors (Apple, US9098418B2, filed 2012-03-20; EP2642398B1 / EP2642398A1 filed 2013-03-18)

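    • Background: in a multi-level cache hierarchy, prefetchers at different levels that train and issue requests independently tend to duplicate work, fill the wrong level, or interfere with each other.
    • Core design: a single unified training mechanism trains on the streams generated by the core; when the core sends a prefetch request to a lower-level cache, it carries the stream ID and training information, and the lower-level cache generates its own prefetches from them.
    • Relevance: closer to L1/L2/LLC coordination than a standalone AMPM; suitable to discuss in the same class as Intel physical-page prefetch, AMD throttling, and Arm pattern selection.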
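
The coverage-hint handshake in item 1 can be sketched with a stride detector as the secondary prefetcher. The confidence threshold of 2, the primary's prefetch degrees (4 normally, 1 once covered), and the class names are hypothetical; addresses are in units of cache blocks.

```python
class StrideSecondary:
    """Secondary prefetcher: learns a constant stride and reports coverage."""

    def __init__(self, threshold=2):
        self.threshold = threshold  # hypothetical confidence threshold
        self.last = None
        self.stride = None
        self.confidence = 0

    def observe(self, block):
        """Train on a demand block; return True (coverage hint) once confident."""
        if self.last is not None:
            stride = block - self.last
            if stride == self.stride:
                self.confidence += 1
            else:
                self.stride, self.confidence = stride, 0
        self.last = block
        return self.confidence >= self.threshold


class Primary:
    """Primary prefetcher that backs off when the secondary claims coverage."""

    def __init__(self, secondary, full_degree=4, reduced_degree=1):
        self.secondary = secondary
        self.full_degree = full_degree
        self.reduced_degree = reduced_degree

    def demand(self, block):
        covered = self.secondary.observe(block)
        degree = self.reduced_degree if covered else self.full_degree
        return [block + i for i in range(1, degree + 1)]


primary = Primary(StrideSecondary())
degrees = [len(primary.demand(b)) for b in (0, 64, 128, 192)]
```

In this sketch the primary keeps issuing at full degree until the secondary has seen the same stride confirmed twice, then drops to the reduced degree for that stream.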

Coprocessor Prefetcher

  1. Coprocessor prefetcher (Apple, US11755333B2 / US12050918B2 / US20250094174A1, original filed 2021-12-10; latest continuation filed 2024-07-25)
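    • Background: when a processor and a coprocessor execute together, the addresses of coprocessor operand data are generated by the processor-side code sequence; waiting until the coprocessor executes to fetch the data exposes extra latency.
    • Core design: the coprocessor prefetcher monitors the code sequence fetched by the processor, identifies coprocessor instructions, captures the operand memory addresses generated by the processor, and issues prefetches to a cache accessible to the coprocessor before the coprocessor executes.
    • Relevance: a prefetcher patent family that Apple was still continuing in 2023-2025, which indicates Apple is still building out a portfolio around indirect operand fetch / heterogeneous-execution latency hiding.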
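
The monitor-and-capture flow described above can be sketched with a toy instruction stream. The mnemonics (`mov_imm`, `add_imm`, `cop_load`), the `Instr` record, and the tiny register model are all hypothetical stand-ins; the real mechanism captures addresses the processor pipeline generates rather than interpreting instructions.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    opcode: str         # hypothetical mnemonic
    dst: str = ""       # destination register for processor instructions
    imm: int = 0        # immediate operand
    addr_reg: str = ""  # register holding the coprocessor operand address


def coprocessor_prefetches(code, regs=None):
    """Walk a fetched code sequence in order; return addresses to prefetch."""
    regs = dict(regs or {})
    prefetches = []
    for ins in code:
        if ins.opcode == "mov_imm":        # processor-side address generation
            regs[ins.dst] = ins.imm
        elif ins.opcode == "add_imm":
            regs[ins.dst] = regs.get(ins.dst, 0) + ins.imm
        elif ins.opcode.startswith("cop_"):  # coprocessor instruction spotted
            if ins.addr_reg in regs:
                # Capture the operand address before the coprocessor runs.
                prefetches.append(regs[ins.addr_reg])
    return prefetches


code = [
    Instr("mov_imm", dst="x0", imm=0x1000),
    Instr("cop_load", addr_reg="x0"),
    Instr("add_imm", dst="x0", imm=0x40),
    Instr("cop_load", addr_reg="x0"),
]
addresses = coprocessor_prefetches(code)
```

The point of the sketch is the ordering: each operand address is known as soon as the processor-side instructions have produced it, which is earlier than the moment the coprocessor would request the data itself.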

Power

  1. Prefetch circuit with global quality factor to reduce aggressiveness in low power modes (Apple, US10331567B1, filed 2017-02-17)
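    • Background: in a mobile SoC, prefetch performance has to be balanced dynamically against energy / battery life.
    • Core design: the prefetch circuit maintains a global quality factor in addition to the per-entry quality factors; when overall quality is low, judged across outstanding prefetches or in a low-power mode, it generates fewer prefetch requests.
    • Relevance: it places AMPM/access-map, large-stride prefetch, an SMS-like spatial mechanism, and global throttling in the same prefetch subsystem, making it a key connecting point in Apple's data prefetcher patent family.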
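
How a global quality factor might gate requests on top of a per-entry one can be sketched as below. The policy shown (cap at one request when in a low-power mode with the global factor under a threshold) and the parameter names are assumptions for illustration, not the patented rule.

```python
def prefetches_to_issue(candidates, entry_qf, global_qf, low_power,
                        global_threshold=8):
    """Return the candidate prefetches allowed for one prefetch entry.

    entry_qf:  per-entry quality factor, used here as a request budget.
    global_qf: quality factor shared by the whole prefetch circuit.
    """
    budget = max(entry_qf, 0)
    if low_power and global_qf < global_threshold:
        # Hypothetical policy: low global quality in a low-power mode
        # caps the entry at a single request.
        budget = min(budget, 1)
    return candidates[:budget]


normal = prefetches_to_issue([1, 2, 3, 4], entry_qf=3, global_qf=10,
                             low_power=False)
throttled = prefetches_to_issue([1, 2, 3, 4], entry_qf=3, global_qf=2,
                                low_power=True)
```

The per-entry factor still limits each stream on its own merits; the global factor only tightens that limit further when the circuit as a whole is prefetching poorly and power matters.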