
How to plan chip layout in the Chiplet era

Automatically mitigating thermal issues has become a top priority in heterogeneous designs.

3D-ICs and heterogeneous chips will significantly alter physical layout tools, where the placement of Chiplets and the routing of signals will greatly impact the overall system's performance and reliability.

EDA vendors are acutely aware of these issues and are actively seeking solutions. The biggest challenge facing 3D-ICs is heat dissipation. Logic typically generates the most heat, and stacking logic chips on top of other logic chips requires a method for heat dissipation. In planar SoCs, this usually relies on heat sinks or substrates to handle it. However, in 3D-ICs, the substrate must be thinned to minimize the distance signals must travel, which reduces the substrate's heat transfer capability. Moreover, heat may become trapped between chips, making heat sinks no longer an option. The solution to this problem is to carefully configure different layers so that heat is dispersed throughout the entire chip or confined to areas where it can be effectively removed, which needs to be built into automated tools.
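To make the stacking problem concrete, the following is a minimal one-dimensional sketch, not a method taken from any vendor tool. It treats the stack as thermal resistances in series and assumes all heat leaves through a single heat-sink path at the top. The function names, layer thickness, conductivity, and power figures are illustrative assumptions.

```python
# Illustrative 1D model: every number below is an assumed value, not measured data.

def layer_resistance(thickness_m, conductivity_w_mk, area_m2):
    """Conductive thermal resistance of a uniform slab, R = t / (k * A), in K/W."""
    return thickness_m / (conductivity_w_mk * area_m2)


def stack_temperatures(powers_w, resistances_k_per_w, ambient_c):
    """Steady-state die temperatures for a stack cooled only from the top.

    powers_w[i] is the heat generated in die i; index 0 is closest to the heat sink.
    resistances_k_per_w[i] is the resistance between die i and the node above it
    (for i == 0 the node above is ambient, reached through the heat sink).
    Heat from die i and from every die below it must cross resistance i.
    """
    temps = []
    temp_above = ambient_c
    for i, resistance in enumerate(resistances_k_per_w):
        heat_through = sum(powers_w[i:])  # heat flowing upward through this layer
        temp_here = temp_above + heat_through * resistance
        temps.append(temp_here)
        temp_above = temp_here
    return temps


if __name__ == "__main__":
    area = 1e-4                                       # 1 cm^2 die, assumed
    r_sink = 0.2                                      # heat-sink path in K/W, assumed
    r_bond = layer_resistance(20e-6, 1.0, area)       # 20 um bond layer, k = 1 W/(m*K), assumed
    temps = stack_temperatures([50.0, 50.0], [r_sink, r_bond], ambient_c=40.0)
    for index, temp in enumerate(temps):
        print(f"die {index}: {temp:.1f} C")
```

In this toy model the lower die always runs hotter than the die next to the heat sink, because its heat has to cross every layer above it. Real stacks also spread heat laterally, which this sketch ignores.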

Alphawave Semi Chief Technology Officer Tony Chan Carusone said: "The transition to the Chiplet design paradigm will affect modern place-and-route flows, requiring the optimization of logical partitioning between chips. This means that the place-and-route flow for Chiplet-based systems must consider multi-chip integration, the potential of heterogeneous technologies, and manage the complexity of high-density inter-chip interconnections. This will require an understanding of the possibilities and limitations offered by different manufacturing and packaging technologies."


After decades of discussions and PowerPoint presentations on stacked chips, the chip industry has run out of options. Chip manufacturers are already designing logic chip stacks and memory chip stacks, and as the cost of planar scaling continues to increase, relying on some type of advanced packaging and Chiplet-based system design is the best choice for improving performance, especially for artificial intelligence and other high-performance computing applications.

In fact, Yole predicts that from 2025 onwards, most server chips will be built using Chiplets, and more than 50% of volume client PCs will use Chiplets. These figures add urgency to adapting tools and workflows.

Floorplanning, placement, clocking, and routing are the four main stages of the place-and-route flow. Floorplanning is the early exploration stage, where designers place large functional blocks in different areas of the chip, determine connectivity, and decide which blocks should sit next to each other. At this stage, blocks have boundaries that divide the chip area into rough partitions. Standard cells are then placed within each block boundary. These are small library cells that comply with the rules in the foundry's design rule manual, and they are wired to each other based on local connectivity. Overall, the floorplanning step contains an abstract view of the top-level connections. "In actual placement, you are doing a detailed placement of all standard cells and macros," said Vinay Patwardhan, Director of Cadence Product Management Group. "Routing is the next step to connect them. With each subsequent phase, more and more information is added to the design."
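One common way to compare candidate floorplans at this abstract level is half-perimeter wirelength (HPWL), which estimates each net's length as half the perimeter of the bounding box around the blocks it connects. The sketch below is a generic illustration rather than the cost function of any particular tool. The block names, coordinates, and nets are made up for the example.

```python
def hpwl(block_centers, nets):
    """Total half-perimeter wirelength over all nets.

    block_centers: dict mapping block name -> (x, y) center coordinate
    nets: list of nets, each a list of the block names it connects
    """
    total = 0.0
    for net in nets:
        xs = [block_centers[name][0] for name in net]
        ys = [block_centers[name][1] for name in net]
        # Width plus height of the bounding box enclosing the net's blocks
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


# Hypothetical connectivity: the CPU talks to the cache over two nets.
NETS = [["cpu", "cache"], ["cpu", "cache"], ["cache", "ddr_phy"], ["cpu", "io"]]

# Two hypothetical floorplans for the same four blocks.
PLAN_A = {"cpu": (0, 0), "cache": (4, 0), "ddr_phy": (4, 3), "io": (0, 3)}
PLAN_B = {"cpu": (0, 0), "cache": (1, 0), "ddr_phy": (4, 0), "io": (0, 3)}

if __name__ == "__main__":
    print("plan A HPWL:", hpwl(PLAN_A, NETS))
    print("plan B HPWL:", hpwl(PLAN_B, NETS))
```

Plan B scores lower because the heavily connected CPU and cache sit next to each other, which is the kind of adjacency decision described above.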

Basic decisions about materials, such as whether to use copper or optical interconnects, are made during early exploration or system design, before floorplanning even begins.

Although these steps are still carried out in the traditional order, the game has shifted from classic chess to three-dimensional chess. "It's a bit complicated now," said Kenneth Larsen, Senior Director of Synopsys 3D-IC Product Management. "When we talk about 2.5D/3D and the transition to multi-chip design, the distance between chips is very close, which brings many new challenges. When we build systems with multiple silicon chips, they are very closely connected. They may be stacked and will affect each other. One issue is power supply to the system. Another issue is thermal, because the distance is very close. The thermal issue is becoming a first-order effect, and where you place things in the floorplan may affect how heat escapes from the design."

Now, all of this happens in three-dimensional space, and each dimension must be considered in the design. Patwardhan said, "Now you have to consider not only planar checks, but also how the placement of objects interacts with the layers above and below. In a 3D-IC stacked-die design, the bottom die often sits on top of the advanced package. It communicates with the HBM or other memory next to it, and also with the objects on top of it. You need to look at the coupling effect from the top die in the z-dimension, look at the increased resistivity, and also look at timing paths that cross dies with synchronous clocks. The close communication between the two dies must be modeled early in the placement process, and the same is true when planning the inter-die connections."

There is another important aspect to consider here. "Because these are stacked metal connections, the high conductivity between metal layers will produce a chimney effect, and there may be a very high heat dissipation in high power density areas," Patwardhan said. "You may have met the timing or power requirements, but you may not have considered thermal as a first-order effect, and now you must."
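A rough way to see why vertical alignment matters is to overlay per-tile power density maps of the stacked dies and look for tiles where hot regions line up. The sketch below is only an illustration with assumed numbers and hypothetical function names. It is not a thermal solver and does not model heat conduction through metal stacks.

```python
def stacked_power_density(die_maps):
    """Sum per-tile power density over all dies (all maps share one grid)."""
    rows, cols = len(die_maps[0]), len(die_maps[0][0])
    return [[sum(die[r][c] for die in die_maps) for c in range(cols)]
            for r in range(rows)]


def peak(density_map):
    """Highest per-tile value in a density map."""
    return max(max(row) for row in density_map)


def mirror_x(die_map):
    """Mirror a die's map left to right, as one candidate placement change."""
    return [list(reversed(row)) for row in die_map]


# Assumed power densities in mW/mm^2 on a coarse 2x2 tile grid.
BOTTOM_DIE = [[900, 200],
              [200, 100]]
TOP_DIE = [[800, 100],
           [300, 100]]

if __name__ == "__main__":
    aligned = stacked_power_density([BOTTOM_DIE, TOP_DIE])
    mirrored = stacked_power_density([BOTTOM_DIE, mirror_x(TOP_DIE)])
    print("peak with hot blocks aligned:", peak(aligned))
    print("peak with top die mirrored:  ", peak(mirrored))
```

Both arrangements dissipate the same total power, but mirroring the top die moves its hottest block away from the hottest block underneath and lowers the stacked peak. A placement engine that is thermal-aware would weigh this kind of trade-off against timing and wirelength.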

Thermal effects

There is an increasing awareness of the importance of thermal effects (especially thermal crosstalk in 3D structures), which affects the way design teams work in this process, breaking down the barriers between specialties. "Thermal issues have always been a challenge," Larsen said. "Before, you threw it to the expert, and he would respond, 'We have a thermal issue, you need to throttle the chip.' But now, we introduce these multiphysics simulations earlier in the design process than we did 10 years ago."

Kai-Yuan (Kevin) Chao, Director of R&D at Siemens EDA, agreed. "Thermal planning in physical design is crucial because most high-performance CPUs use boost and power-throttling features to stay within the hard limit on transistor junction temperature, thereby ensuring chip reliability. In short, running a steady-state thermal simulation of a floorplan at worst-case power is less meaningful than simulating the target application workloads of multiple market segments, which run on different cores and memory in various combinations under the product's cooling conditions."

Reducing the throttling margin between thermal sensors is very important for measuring hotspots caused by the most critical workloads. This determines the distance between different processing elements, and/or how various operations are partitioned and prioritized. Chao pointed out: "Because the time spent at the upper and lower voltage/frequency limits affects performance and computational throughput, transient thermal power-ramp modeling is also needed, along with in-simulation adjustment of temperature-sensitive parameters such as leakage." Package design and cooling system design, including integrated voltage regulator inductors and routing, also require early power and thermal maps from the chip design to coordinate assembly and product release. Therefore, consistent convergence of the physical floorplan (including I/O) and the power budget is important from the pre-RTL architecture phase to the final pre-tapeout layout phase.
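The interplay Chao describes between transient heating, temperature-dependent leakage, and throttling can be illustrated with a single-node lumped thermal model. This is a deliberately simplified sketch, not a sign-off method. The exponential leakage model is a common approximation, and every parameter value and name below is an assumption chosen for illustration.

```python
import math


def simulate(dynamic_power_w, steps, dt_s=0.01, r_th=0.5, c_th=2.0,
             ambient_c=40.0, leak_ref_w=5.0, leak_ref_c=40.0, leak_coeff=0.02,
             throttle_c=None, throttle_factor=0.6):
    """Explicit-Euler integration of C * dT/dt = P(T) - (T - T_ambient) / R.

    Leakage is recomputed every step as leak_ref_w * exp(leak_coeff * (T - leak_ref_c)),
    so a hotter die leaks more and heats further. If throttle_c is set, dynamic
    power is scaled by throttle_factor whenever the temperature is at or above it.
    Returns the list of temperatures after each step.
    """
    temp = ambient_c
    trace = []
    for _ in range(steps):
        dynamic = dynamic_power_w
        if throttle_c is not None and temp >= throttle_c:
            dynamic *= throttle_factor          # crude stand-in for frequency throttling
        leakage = leak_ref_w * math.exp(leak_coeff * (temp - leak_ref_c))
        heat_in = dynamic + leakage
        heat_out = (temp - ambient_c) / r_th
        temp += dt_s * (heat_in - heat_out) / c_th
        trace.append(temp)
    return trace


if __name__ == "__main__":
    free_running = simulate(60.0, steps=1000)
    throttled = simulate(60.0, steps=1000, throttle_c=70.0)
    print(f"no throttling, final temperature: {free_running[-1]:.2f} C")
    print(f"throttle at 70 C, peak temperature: {max(throttled):.2f} C")
```

With these assumed values, leakage feedback pushes the unthrottled die a few degrees above what a fixed-leakage estimate would give, and the throttled run hovers near its limit. How much time the throttled run spends at reduced power is what translates into lost throughput.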

Figure 1: The interaction between layout planning and thermal management. Source: Synopsys

Even before the designer delves into complex multiphysics analysis, floorplanning can indicate where there may be thermal issues. Andy Nightingale, Vice President of Arteris Product Management and Marketing, said: "Once we see the layout view on the screen and start the NoC design, we can see where there are congestion points. These high-density connections can be seen as hot spots in the design."
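As a rough illustration of how congestion points can be spotted early, the sketch below routes a few traffic flows across a mesh using dimension-ordered (XY) routing and sums the bandwidth on each link. This is a textbook simplification and not a description of how Arteris or any other NoC tool works. The mesh coordinates, flows, and bandwidth figures are hypothetical.

```python
from collections import Counter


def xy_route(src, dst):
    """Links visited by XY routing: travel along x first, then along y."""
    x, y = src
    links = []
    while x != dst[0]:
        next_x = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (next_x, y)))
        x = next_x
    while y != dst[1]:
        next_y = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, next_y)))
        y = next_y
    return links


def link_loads(flows):
    """flows: iterable of (src, dst, bandwidth). Returns Counter of link -> total bandwidth."""
    loads = Counter()
    for src, dst, bandwidth in flows:
        for link in xy_route(src, dst):
            loads[link] += bandwidth
    return loads


# Hypothetical flows on a 3x2 mesh, with bandwidth in arbitrary units.
FLOWS = [
    ((0, 0), (2, 0), 4),
    ((1, 0), (2, 1), 3),
    ((0, 1), (2, 1), 2),
]

if __name__ == "__main__":
    for link, load in link_loads(FLOWS).most_common(3):
        print(link, load)
```

The most heavily loaded link marks a candidate congestion point. In a physical floorplan, the region around such a link is where dense wiring and switching activity would be expected to concentrate.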

All of this highlights why EDA companies encourage users to shift left. Patwardhan said: "If you are doing signal integrity-aware routing, you must model it early in the process. How good your model is will determine how accurate you are at the end of the design phase. We must do some additional sign-off checks or thermal analysis checks in the early stages of the process, as well as signal and power integrity analysis. Therefore, if we are talking about multi-chip placement at the cell level, whether in 2.5D configurations or stacked-die configurations, many system-level sign-off checks must be modeled early in the implementation process. We must come up with new abstraction methods, some new methods to let the placement environment handle multiple objects, optimize more parameters at once, and do well enough so that each design does not need to be reopened when there is an engineering change order (ECO). It is not practical to include everything too early from a runtime perspective or a design method perspective, but we can do enough work early on to ensure that iterations are reduced after the first pass."

Looking forward to the future of AI

There is a consensus that EDA is already a kind of AI because it has always been an algorithm-based aid for human designers. However, the tools are still evolving. EDA suppliers are now considering extensions, such as providing generative AI copilots for tools, integrating more multiphysics simulation, and developing design engines specifically for handling multi-chip and multi-dimensional structures.

The hope is that artificial intelligence can bring predictive intelligence to traditional place-and-route. "We are already good at integrating advanced algorithms into NoC design for various optimizations," Nightingale said. "The next step is to predict and optimize floorplanning and place-and-route results based on historical data (even possibly real-time analysis). We also need to work closely with ecosystem partners across different fields to make more efforts to keep the design within the given constraints."

The academic community is also providing help. MIT has just announced a new AI-based method, named Virtual Node Graph Neural Network (VGNN), which uses virtual nodes to represent phonons to accelerate the prediction of material thermal properties. The authors of the paper claim that running VGNN on a personal computer can calculate the phonon dispersion relations of thousands of materials in just a few seconds.

Conclusion

Today's chiplet, system, and packaging designers are facing an increasing diversity of technologies and the requirement for system-level collaborative optimization. "The substrates are larger and more complex, including interposers and silicon interposers embedded in the substrate, which require EDA routers to handle the rapidly growing interconnections between different material layers, and adopt specific design rules and high-speed electrical and thermo-mechanical constraints to improve productivity," said Chao from Siemens. "In addition, special wiring requirements demand EDA innovation, such as substrate capacitors and optical components. Fine-pitch hybrid bonding enables single-clock-cycle interconnects to perform cell-level timing and I/O layout in vertical cross-chip 3D planning. Nevertheless, increasing the number of transistors in the chips within the package requires more efficient power delivery and heat dissipation. For example, TSMC has added IVR to its future HPC/AI 3D-IC configurations. Integrated cooling solutions, including liquid cooling, have been co-optimized in NVIDIA's new products."

Power and heat dissipation are increasingly severe challenges. "In addition to the back-side power delivery network introduced to meet thermal design requirements below 2nm, thermal-aware placement and floorplanning requirements (such as co-design of multi-chip module microchannel cooling) may re-emerge if the product design includes integrated packaging/system liquid cooling," Chao continued. "In a collaborative development process owned by multiple stakeholders, early physical design with multiphysics awareness will be very beneficial, as unrealistic assumptions in the post-verification chiplet assembly stage may lead to very costly fixes."

There is still a long way to go before the 3D-IC design flow is fully optimized. "We are just at the beginning of this journey," said Patwardhan from Cadence. "We have developed some quite good algorithms that can perform 3D placement, 3D floorplanning, thermal-aware 3D floorplanning, and placement simultaneously. But now everyone in the design community and the EDA community is very conservative, leaving extra margin for stacked-die design, because we are in the stage of process development and early test chips. In a very short time, we will develop optimized flows from what we learn, just as we rapidly did during the era of finFET and GAA transistors. Now, stacked dies just add the challenge of an additional dimension. It is only a matter of time before we can propose optimized and fully automated 3D place-and-route flows for complex 3D-IC designs."

