A Review of Chiplet Functional Partitioning Methods and Interconnection Architectures

CHEN Long (陈龙); HUANG Letian (黄乐天)

doi:10.20193/j.ices2097-4191.2024.02.005

Integrated Circuits and Embedded Systems (集成电路与嵌入式系统), 2024, Vol. 24, Issue 2: 41-49.

Special Column: Chiplet Research

Chiplet functional partitioning and interconnection review

CHEN Long, HUANG Letian*

Abstract

Chip design currently faces the challenge of the "area wall", which drives up tape-out costs in chip manufacturing. Chiplet technology fabricates smaller dies on mature process nodes and combines them through advanced packaging, which breaks the area-wall limit, enables agile chip design, and reduces design cost. A core question in applying chiplet technology is what chiplet granularity satisfies the flexibility requirements of chip design. The way functions are partitioned across chiplets also shapes the inter-chiplet interconnect structure, and interconnecting the functional chiplets is key to realizing the final function of the chip. This paper therefore reviews recent domestic and international research on chiplet functional partitioning, design space exploration for chiplets, and the influence of functional partitioning on inter-chiplet interconnection networks, and identifies chiplet design methodology as an important research direction for the future development of chiplet technology.

Cite this article

CHEN Long, HUANG Letian. Chiplet functional partitioning and interconnection review[J]. Integrated Circuits and Embedded Systems, 2024, 24(2): 41-49. https://doi.org/10.20193/j.ices2097-4191.2024.02.005

CLC number: TN47 (large-scale and very-large-scale integrated circuits)
