Design planning for large SoC implementation at 40nm - Part 3
Bhupesh Dasila - August 09, 2013
[Part 1 explores the process technology to learn its capabilities and limitations, including evaluating the technology libraries, determining the implementation tools and flows, and capturing the SoC requirements. Part 2 covers comprehensive planning for complex designs at lower geometry.]

Floorplanning and PnR

A thorough exercise during physical architecture is the foundation for an efficient floorplan, and helps reduce the overall turnaround time of the physical design phase. The broader perspective of the floorplan should be worked out during the physical architecture phase; the actual floorplanning phase should then address the finer details of the floorplan, which impact the physical design’s QoR.

Floorplanning guidelines

The seed for a floorplan primarily comes from the physical architecture, the die size and power estimation exercise, and the technology. When creating a floorplan, it’s important to consider some basic characteristics of the process technology. The designer should have explored the technology enough in the context of metal stack and metal configuration, and should have gained ample experience with the availability of vertical and horizontal routing resources and their requirements for the design as per the physical architecture. At any level, creating “non-preferred” routing (i.e. not using the preferred routing direction for that layer) is not recommended. In the case of a channel-based floorplan, four-way intersections in top-level channels should be avoided when placing blocks; “T” intersections create much less congestion. This consideration can be critical in leaving adequate space for routing channels, especially if there is not much opportunity for over-the-cell routing.
Using fly lines can help determine optimal placement and orientation, but when the fly lines are numerous enough to “paint the area” between blocks, designers must rely on their best judgment for block placement, and later evaluate the results for possible modification. Once blocks are placed, block-level pins may be placed. It is necessary to determine the correct layer for the pins and spread the pins out to reduce congestion. Placing pins in corners where routing access is limited should be avoided; instead, multiple pin layers should be used for less congestion. It is worth spending the time needed to place block pins manually so that block-to-block routes are straight, have minimum distance between them, and are crosstalk immune. This will help immensely down the line during full-chip timing closure.
While placing hard macros, like PLLs or other analog blocks, it is important to adhere to the guidelines provided by the IP vendor. Placing cells within the perimeter of hard macros is not recommended. To keep from blocking access to signal pins, it is a good idea to avoid placing cells under power straps unless the straps are on high metal layers (i.e., higher than metal2). Density constraints or placement blockage arrays may be used to reduce congestion, since these strategies help spread cells over a larger area, thereby reducing the routing requirements in that area. In any physical design work, it is essential to understand the requirements of the target process technology. Lower utilization results in a larger chip, but the chip is less likely to have routing problems. For example, most processes now require the insertion of holes in large metal areas in a step known as “slotting” or “cheesing.” Slotting relieves stress-related problems in the metal due to thermal effects, but may change the metal’s current-carrying characteristics. It is imperative to consult the design rule document for this and many other physical variables. For technology nodes below 40nm, there are other important rules that must be considered while creating the floorplan. For example, TCD structures are placed to monitor the various processes on the die. These components are mandated by the foundry and are required to be placed at regular intervals throughout the chip. They can be of significant size, which may need to be allocated on the die early on; if this is not considered early, it may disrupt the floorplan of a block, such as one packed with memories, at a later stage. Similarly, for core ESD protection, it is recommended to place ESD clamps at regular intervals on the die.
These are a few of the mandatory components that must be considered in the early stages of full-chip floorplanning, and must also be considered for block-level floorplanning. The RTL should be examined for logical modules to break out into hierarchical physical elements. If there are multiple instances of any logical hierarchical element, these elements can be grouped to form one physical element. It is easier to floorplan with same-size blocks, so small blocks should be grouped and large blocks divided where appropriate. Working with “medium-sized” blocks is typically best; six to twelve roughly equivalent-sized blocks is a reasonable target. Typically, floorplans should be started with I/Os at the periphery (depending on the package design). It is crucial to determine the total number of I/Os required, and their placements. The physical designer must calculate the number of core Vdd and Vss pads in the I/O ring for ESD protection (in the case of a flip chip). In the case of a wire-bond chip, the number of core Vdd and Vss pads must be calculated through a thorough analysis of the chip power requirements and IR drop. The SSO ratio must also be considered when calculating the VddIO and VssIO counts; care must be taken for high-speed interfaces. Apart from this, there could be some custom requirements for I/O planning, such as the placement of PLLs in the I/O rings, or the placement of SerDes or DDR IP, which include their own bumps. During and after the I/O planning, it is very important that the bump or bond pad DRC and LVS are clean before a floorplan can be considered final. Hence, I/O planning is yet another seed for floorplanning. It is best to place parts of the design that have special layout requirements (e.g., memories, analog circuitry, PLLs, logic that works with a double-speed clock, blocks that require a different voltage, any exceptionally large blocks, etc.) first, to ensure that their needs are accommodated.
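As a rough illustration of the core power pad calculation mentioned above, the sketch below estimates the number of core Vdd (or Vss) pads from the total core current and a per-pad current rating. All numbers here (power, voltage, per-pad limit, margin) are hypothetical placeholders; real values come from the chip power analysis, the package data, and the I/O library datasheet.

```python
import math

def core_power_pad_count(core_power_w, vdd_v, max_current_per_pad_a, margin=1.2):
    """Pads needed so that no pad exceeds its rated current (hypothetical model)."""
    total_current_a = core_power_w / vdd_v        # I = P / V
    with_margin = total_current_a * margin        # headroom for IR drop and hotspots
    return math.ceil(with_margin / max_current_per_pad_a)

# Example: a 2 W core at 0.9 V with a 25 mA per-pad rating
pads = core_power_pad_count(2.0, 0.9, 0.025)      # 107 pads
```

In practice this first-cut number would be refined through the thorough IR-drop analysis the text calls for, not taken at face value.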
Design blocks with special needs must be understood at the beginning; for example, flash memory has a high-voltage programming input that must be within a certain distance of an I/O pin, so it is best to place this element first. If there are two or more large blocks or other features that make a reasonable floorplan impossible, it may be necessary to increase the die size or re-arrange I/Os. Finding this problem early in the flow
enables an easier business decision about whether the chip will be financially viable with a larger, more expensive die. If any of the large blocks are soft (synthesizable) IP or otherwise available as RTL, it might be possible to avoid going to a larger die by repartitioning that block into smaller pieces. Another key aspect in I/O planning is to consider the scan I/O requirement. Here, the physical designer must engage with the DFT architect. The scan architecture could also play a significant role in I/O planning and floorplanning.

Block-level floorplanning

A good floorplan of the blocks is crucial to faster timing closure. Design knowledge and an understanding of the data flow help immensely in creating an optimum floorplan, which leads to faster convergence of the block. So, the physical designer must engage with the RTL designer. Establish up front whether the block floorplan needs to be pushed down from the top or the other way around. Typically, the block floorplan is pushed down from the top, but in the case of some critical blocks, the top floorplan may have to adjust to the block floorplan requirements.
● Initial synthesis should be run to determine the total area of the cells in a block. Determining the area of a block beyond the area of its cells is a factor of utilization. Utilization varies depending on the library, technology, and characteristics of the design implemented; for a typical library, the sweet spot is usually about 70% utilization. An unusually high percentage of memories or hard IP will increase this number; large numbers of multiplexers or other small, pin-dense cells will decrease it.
● The 70% utilization from a synthesized netlist may not be optimal for every block, as the growth of every block depends on various factors, most importantly the number of memories and the associated MBIST logic post DFT insertion. A block without any memory can have a starting utilization as high as 75%, while a block packed with memories may have a starting utilization as low as 60%.
● Therefore, it’s best to arrive at the best utilization number after taking a block through an entire rough PD pass in the early phase.
● Care must be taken while placing the memories. For example, there is a poly orientation rule from the foundry at technology nodes below 40nm, which does not allow memories to be oriented at 90 degrees from one another (i.e. memories should all be aligned either to the X axis or to the Y axis). Realizing this at a later stage could be catastrophic. Hence the foundry rules must be understood and considered right from the beginning of the physical design activity.
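The utilization arithmetic above translates directly into a first-cut block area estimate, as sketched below. The cell area and the utilization targets are illustrative placeholders only; the 60-75% range is taken from the discussion above.

```python
def block_area_um2(cell_area_um2, utilization):
    """Placeable block area implied by a target utilization."""
    return cell_area_um2 / utilization

# Hypothetical 500,000 um^2 of synthesized cell area:
logic_only = block_area_um2(500_000, 0.75)    # logic-only block, ~666,667 um^2
memory_heavy = block_area_um2(500_000, 0.60)  # memory-heavy block, ~833,333 um^2
```

The spread between the two results shows why a memory-heavy block needs noticeably more floorplan area for the same netlist size.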
PnR considerations

The physical designer must not rely on a push-button PnR approach; it’s better to take stock of the situation at every PD stage. The physical design optimization and implementation go through various stages: placement, placement optimization, CTS, post-CTS optimization, hold fixing, routing and routing optimization, crosstalk repair, and chip finishing. The best approach in physical design is to be able to push down the physical information from the physical synthesis environment to the PnR environment. However, this may not be possible if the physical synthesis and PnR tool vendors are different. In such cases, the physical designer must build in a margin for miscorrelation by exercising some critical benchmark blocks through synthesis to the PnR process. Typically every block would be processed through the PD flow a couple of times before going in for the final PD implementation. Every design could impose its own challenges and could have its own
customizations for better optimization. Some designs with higher pin density and more complex cells can impose major routing challenges. Some designs can have a high combinational-to-sequential cell ratio, which is more challenging in PD; in general, a ratio beyond eight could be highly challenging to converge. Some designs may have a requirement of creating soft regions. If there are critical signal nets that need higher attention, such as adopting NDRs like clock nets do, then these must be identified. All of these issues must be identified by the physical designer prior to final implementation. One should also consider incorporating the process technology-related requirements early on, like well taps and end caps, to avoid physical DRC-related issues later. The routing resource allocation and assessment for signal nets must be done when the power structure for the chip is created. This is where the metal stack plays a key role. A physical designer would always be comfortable with a higher metal stack, but the metal stack decision has to align with business decisions. The PD implementation flow from the RTL to the PD environment could vary depending on how the MBIST logic insertion is done (i.e. on RTL or in the netlist). It’s better if the physical synthesis is done post DFT insertion, since the BIST logic-to-memory path at functional frequency may require physical synthesis optimization. The synthesis, MBIST implementation and PD teams must collaborate to achieve better PD implementation of a block. One of the crucial seeds to PD is the constraints. The STA and PD engineers must streamline the constraints before starting the PnR activity. The PD engineer must figure out what margins are to be kept at different stages of PnR for better optimization. One important thing to assess is where you would want to fix hold. Encounter does a good job in fixing hold after the post-CTS stage, but care must be taken regarding the number of hold buffers inserted.
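The combinational-to-sequential ratio mentioned above can be checked up front from netlist cell counts. A minimal sketch, with hypothetical counts and the threshold of eight taken from the text:

```python
def combo_seq_ratio(combinational_cells, sequential_cells):
    """Ratio of combinational to sequential cells in a netlist."""
    return combinational_cells / sequential_cells

def pd_convergence_risk(ratio, threshold=8.0):
    """Flag blocks whose ratio exceeds the ~8 rule of thumb noted above."""
    return "high" if ratio > threshold else "normal"

ratio = combo_seq_ratio(900_000, 100_000)  # 9.0
risk = pd_convergence_risk(ratio)          # "high"
```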
If there is a huge difference between the estimated routing delay and the actual routing delay (after detailed routing), then the PD tool might find the design environment at the post-CTS stage very pessimistic for hold and may end up adding more hold buffers than needed. At the same time, it may not be effective to wait until the crosstalk-fix stage to fix the hold violations. However, if the design is sensitive toward power, then it may be worth assessing hold violation fixing through timing ECOs. From the chip-level perspective, at the beginning of PnR, it is advisable to figure out the exact number of interactions between the blocks (i.e. the interface pins and the frequency of every interface). This helps in allocating the routing resources or customizing the critical interfaces. The PD engineer must consider a keep-out region around the block to stay away from crosstalk issues from within the block. PDV (physical DRCs and LVS) is traditionally planned at a very late stage of SoC implementation, but it is highly recommended to look at this aspect early on. With the increased complexity in the designs and stringent foundry rules at 40nm and below, PDV is no longer an isolated activity left toward the end. In some cases it may also have an impact on the floorplan. It is mandatory to clean the bump or bond pad DRC before freezing the floorplan and I/O plan. The physical designer must look at the overall DRC and LVS situation thoroughly with the pre-final netlist so that there are no surprises once the design timing is closed with the final netlist. The block-level physical designers must try to clean the DRC and LVS on their respective blocks before they are assembled at the top level. The designer should assess the overall PDV situation, and should find solutions to every DRC at the pre-final stage so that toward the end, under huge schedule pressure, the DRC and LVS cycle can be reduced and made predictable.
Hence, PDV must be incorporated in the pre-final stage of the PD schedule.

PD integration
An SoC typically has various components on one die. It needs thorough planning and extensive manual and automatic verification to ensure that each and every component functions as desired in the SoC environment. The PD engineer must thoroughly study the layout integration guidelines of each and every standard IP being integrated. The ESD and latch-up requirements and the overall electrical rules must be well understood. The designer should also be clear on the multiple power regions and their isolation requirements. At a higher level, the ESD and latch-up rules appear simple, but their planning and verification become very exhaustive; therefore, they need to be considered during the floorplanning stage. ESD is the transient discharge of static charge through a component when the chip comes in contact with a charged body. If not designed carefully, a combination of parasitic components might trigger current flow in the substrate and result in a short between power and ground, causing a huge current flow through the component and eventually damaging the device. This is called latch-up. ESD events and latch-up might result in permanent damage of an ASIC, so it is imperative to take preventive measures while designing the chip. Some of the considerations for ESD and latch-up are:
● I/Os are the only external interfaces susceptible to ESD; hence the standalone I/O should be protected. The ESD diode provides the protection for an instant current surge and passes it on to the “low resistance” network. The designer should ensure that the power I/Os have ESD protection.
● An ESD design should be such that for any “external interface point” there is a lowest-possible resistance path for ESD current. Total bus resistance also includes resistance from the bump to the I/O, or the bond pad to the I/O, so the designer should ensure there are no weak connections from the source.
● The designer must design for a common ESD path. Usually it’s the core VSS, since it has the lowest resistance mesh and it connects to a big plane on the package.
● Every power domain needs to be understood for its ESD scheme, and the designer needs to come up with an ESD plan that will protect the entire design. Enough ESD clamps should be added in each of the domains.
● For latch-up protection, reduce the substrate resistance so that a voltage drop does not build up to cause latch-up, or increase the substrate resistance so there is no stray current in the substrate to cause latch-up. Use guard rings/guard bands to isolate the regions with a potential chance of latch-up.
● The designer must check that all the standard IPs have substrate taps. Standard cells may or may not have substrate taps built in; per the foundry rule, “TAP cells” must be added if the substrate taps are not built in. Memories typically have substrate taps.
● The designer should check that all the standard I/Os have built-in substrate taps. Latch-up protection is important when I/Os operating at different voltages are placed close together. Add substrate taps (with Pdiff), guard bands (with Ndiff), or a combination of both to avoid latch-up.
● For latch-up protection on IPs, designers must consider careful placement of memories, complex IPs and I/O domains, while considering voltage levels and the proximity of the sensitive circuits to the high-power/switching circuits.
Overall electrical rule checks require identifying a lot of rules and verifying them, both manually and automatically. In the early phase of implementation, PD engineers should consider identifying and automating those checks.

Clock planning
The foundation of the clock network starts right at the architecture stage. A lot of planning must go into architecting the clock before it reaches the clock tree synthesis stage at the physical design level. Appropriate clock definitions and constraints should be the seed for the clock network layout. A sufficient amount of time must be spent creating a clock specification file. For example, the generated clock tree constraints file may not contain all of the necessary constraints; it might require an understanding of the clock strategy, which can help in defining the root pin. Therefore, the recommendation is not to use the auto-generated constraint file blindly, but to create your own file after understanding the clock strategy. All clock group statements must be specified before any clock specification. Clock grouping is done to ensure that the maximum skew between sinks does not exceed the maximum skew specified in the clock tree specification file. It is crucial for physical designers to understand the clocking from a design perspective. The physical designer should have a clear understanding of the modes and of how the various clock domains interact. In the case of multiple-clock-domain designs, it is crucial to create a worst-case or merged-mode clocking constraint that covers the maximum number of timing-critical paths. The idea must be to create a clock tree that is optimized for all the modes. The starting point of a clock spec is to identify the appropriate clock buffers. The libraries contain a range of clock net buffers and inverters that are designed to have nearly matching rise and fall signal behavior; such behavior helps the generation of balanced clock circuitry. These cells also have much finer steps in drive strength compared to regular buffers and inverters. Additionally, the clock net buffers are designed such that the input capacitance of each drive strength version is nearly identical.
This offers the possibility to exchange cells in a clock circuit to tune the drive strength without affecting the loading of the net connected to the input of the cell, and without affecting the overall clock tree performance. Use of clock inverters is preferred over clock buffers, since clock inverters have the ability to regenerate the edges. Clock inverters maintain the duty cycle better, which is crucial for half-cycle paths. Clock planning has to be crosstalk aware. Crosstalk on the clock path impacts setup paths twice as severely as crosstalk on the data path, because it is accounted for in both the capture and launch delay computations. Crosstalk in the common clock path cancels out during hold time analysis, since the analysis is done on the same edge, while for setup it is counted twice. Therefore, it’s extremely important to lay out the top-level clock tree with the least amount of crosstalk. This will directly benefit block-level setup timing closure. One factor PD engineers should consider is budgeting for top-level crosstalk while closing timing at the block level. One should be pessimistic yet realistic so as not to be surprised when block-level timings are seen in the top-level environment. Crosstalk-aware clock planning is challenging at 40nm and below. Usually the critical path data optimization is already exhausted at multiple levels, from physical synthesis to physical optimization. When it comes to crosstalk fixing, it is usually solved through more custom efforts; the default flow does not help much. Usually there is not much room left for routing, due to floorplan optimization and the higher pin density inherited from the RTL complexity. All these factors make crosstalk optimization a challenging job. Some of the key points to plan while synthesizing and routing clocks are:
● Using max distance along with slew constraints can help in building an SI-free clock network without shielding.
● Use timing-aware downsizing of the aggressor net driver to fix SI issues.
● Use max distance constraints wisely. Though more clock buffers are used with tighter max distance constraints, the clock latency and skew come out much lower.
● Change the routing topology of aggressor and/or victim net segments.
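The double-counting of clock-path crosstalk for setup, and its cancellation for hold on a common path, can be illustrated with a toy single-cycle timing model. All delays below are invented, in picoseconds; this is only a sketch of the reasoning, not a real STA model.

```python
def setup_slack_ps(period, common_path_delta, data_delay, setup_time):
    """Worst-case setup: launch clock arrives late (+delta), capture early (-delta)."""
    launch_edge = 0 + common_path_delta
    capture_edge = period - common_path_delta
    return capture_edge - (launch_edge + data_delay) - setup_time

def hold_slack_ps(common_path_delta, data_delay, hold_time):
    """Hold checks the same edge, so a common-path delta cancels out."""
    launch_edge = 0 + common_path_delta
    capture_edge = 0 + common_path_delta
    return (launch_edge + data_delay) - capture_edge - hold_time

# A 50 ps crosstalk delta on the common clock path costs 2 x 50 ps of setup slack:
clean = setup_slack_ps(1000, 0, 700, 50)    # 250 ps
noisy = setup_slack_ps(1000, 50, 700, 50)   # 150 ps
# ...but the hold slack is unchanged, since the delta cancels on the same edge.
```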
Relying on the tool to achieve a desired clock tree result may not be a good idea; the physical designer can understand and specify the requirement much better. Below is an example showing the divergence reduction in going from the tool-based CTS approach to a semi-auto (custom) CTS approach.

Tool-based auto CTS approach
Clock routing to Block1
Clock routing to Block2
Observations on tool-based auto CTS
● There is not much of a common point from where the tool diverges between the interacting blocks.
● A minimal common point between interacting blocks leads to divergence.
● More divergence => more skew.
● Total divergence distance in block1 = 6106 + 2402 = 8508u
● Total divergence distance in block2 = 2 x 3060 + 3905 = 10025u
● Skew due to divergence, block1-to-block2 = 8508 x derate - 10025 x derate; block2-to-block1 = 10025 x derate - 8508 x derate
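The skew arithmetic in these observations can be reproduced numerically. The derate factor is left symbolic in the article, so the value used here is purely a placeholder for illustration.

```python
def divergence_skew_ps(div_a_um, div_b_um, derate_ps_per_um):
    """Skew contribution from unbalanced divergence distances (toy model)."""
    return (div_a_um - div_b_um) * derate_ps_per_um

block1_div = 6106 + 2402       # = 8508u, per the observations
block2_div = 2 * 3060 + 3905   # = 10025u

# block1-to-block2 = 8508 x derate - 10025 x derate (negative: block2's
# diverged path is longer), with a hypothetical derate of 0.01 ps/um.
skew = divergence_skew_ps(block1_div, block2_div, 0.01)
```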
Semi-custom CTS approach

The following guidelines are adopted:
● Manually add anchor points (clock buffers) at the desired locations in the layout.
● Detach the clock sinks from the source (PLL) and attach them to the desired anchor point. This requires an ECO to be done in the netlist.
● The clock path from the source (PLL) to block1 spans an approximate distance of 19mm, of which the divergence is only 3100um (6 buffers).
● Anchor points are inserted from the PLL up to the maximum common point between interacting blocks from a floorplan placement perspective.
Clock tree after semi-auto CTS approach
Advantages:
● The designer dictates the diverging point between the interacting blocks/sinks through ECOs.
● Divergence is optimal.
One can also consider creating a feed-through path through a block. Example below:
Reduced clock divergence between the blocks
Timing Constraints, Budgeting

Constraints drive the chip implementation, and the implementation of a chip is verified against the required constraints. So, one can say constraints are the bounding box of SoC implementation. The definition of constraints must start at a very early phase of RTL development. The starting point must be understanding the various I/O specifications and data sheets, and translating them into an easily understandable spreadsheet from which the SDC should be coded. In the case of a design with multiple complex clocks, half the job is done if the clocks are correctly defined. Constraints development is typically a refinement process. It’s important that constraints are coded from a full-chip perspective. For large hierarchical chips, the crucial part is translating the constraints from top to bottom (i.e. chip level to sub-chip level). It is equally important to merge sub-chip-level constraints at the full-chip level, for example, bringing a sub-chip exception up to the chip level. This is where the budgeting methodology plays a key role. The designer must be able to accurately estimate the top-level interconnect delay based on the delay estimation exercise. The data from the delay estimation can be used to devise the block-level I/O constraints in the early phase. The basic idea of accurate budgeting is to reduce the STA iterations between the blocks and the top. The block-level constraints should be complete and accurate enough so that once a block is timing closed at the block level, it remains timing closed when seen in the full-chip context. In the full-chip context, there are two kinds of interactions that need to be understood thoroughly before executing the budgeting: first block-to-block, and second block-to-chip I/O. These require a clear understanding of some important parameters, like pad delays, interconnect delays, clock propagation, data propagation, clock-data skew and the slack margin parameters (i.e.
the available setup/hold and max/min delay values in the interface data sheet). In fact, before thinking about budgeting, the STA engineers must come up with the STA strategy for the full chip. Hierarchical STA is the way to go for large ASICs. The STA engineer must come up with a strategy for timing verification of the chip at various stages. In the initial phases it would be model-based STA; QTM, ETM or ILM models can be used. One can come up with a mixed-model approach depending upon the RTL and PD stage of each of the blocks. For a large and complex ASIC, it is very important that full-chip timing is continuously monitored from the early RTL development stage, so that the PD and STA engineers get a good hold on the chip interface and block-to-block interconnect timings.
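As a minimal sketch of the budgeting idea described above: the estimated top-level interconnect delay, skew and margin are subtracted from the clock period, and the remainder is split between the driving and receiving blocks as output/input delay budgets. The 50/50 split and all numbers here are hypothetical, not a prescribed methodology.

```python
def block_io_budgets_ns(clock_period, top_interconnect_delay,
                        clock_skew=0.0, margin=0.0, split=0.5):
    """Split the timing budget of a block-to-block path (all values in ns)."""
    available = clock_period - top_interconnect_delay - clock_skew - margin
    output_budget = available * split         # driving block's output path
    input_budget = available * (1.0 - split)  # receiving block's input path
    return output_budget, input_budget

# 500 MHz interface (2 ns period), 0.4 ns estimated top-level route,
# 0.1 ns skew and 0.1 ns margin -> 0.7 ns budget on each side
out_b, in_b = block_io_budgets_ns(2.0, 0.4, clock_skew=0.1, margin=0.1)
```

The resulting budgets would then be expressed as set_output_delay and set_input_delay constraints in the block-level SDC.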
It’s worth spending time cleaning up the constraints at the pre-layout stage of physical design. Constraints and timing analysis should drive the PD implementation, because eventually the quality of the PD implementation is validated against the constraints.

Design closure and signoff

At the design closure stage, PD engineers primarily verify and sign off the design for timing, power and foundry rules. However, the strategy for signoff must be chalked out well in advance, in the PD implementation stage. The designer must have clarity on the operating conditions of the chip and create the power and timing verification environment accordingly. It is wise to look into this before the PD implementation is in full swing.

Timing closure

Another important aspect is to define the signoff corners. It is worth taking a look at reducing the timing closure corners. The application of the chip should be taken into consideration while coming up with the various permutations and combinations of STA corners. One should avoid overdesigning for non-existent application corners; every additional STA corner has a cost in effort and iteration time. Defining the total number of functional and DFT modes right at the beginning of the PD implementation stage is crucial. It has to be thoroughly verified that the STA modes of the chip are optimum and cover everything required for timing verification, keeping all the defined applications and the environment in mind. The STA, DFT and PD engineers must spend time to craft an overall chip timing verification strategy. The mode coded in the SDC for PD implementation has to be comprehensive enough to cover all the modes. So, the key is rightfully merging the modes for PD implementation as well as for timing verification. Signal integrity has been one of the key challenges in complex SoC design at technology nodes like 40nm, as coupling capacitance becomes proportionally larger at these geometries.
Interconnect delays become bottlenecks for performance improvement. Large-scale integration of systems into silicon brings many long interconnects between the blocks. Signal arrival times at the destination dynamically change on these long interconnects as a result of the large associated coupling capacitance. The induced delay (positive or negative) is called crosstalk delay. Crosstalk delay is hard to predict due to its dynamic nature, and if not taken care of it will adversely affect the performance of the device and may lead to device failure. Some of the traditional timing closure/design budgeting techniques become ineffective while closing timing for crosstalk-induced delay. The challenges are realistic analysis, avoiding pessimism, careful isolation of valid violations, and determining a methodology for efficient fixes. Some of the key points to drive crosstalk-aware implementation and analysis are:
● ● ● ● ● ● ●
for larger die area for more buffers, less net congestion, more shielding, etc. so that crosstalk effects are mitigated. Add a couple of weeks in the schedule just for crosstalk cleanup. Start the process early. Make sure the crosstalk settings are realistic, and have a dependable crosstalk delay analysis flow. Ensure slew (transition) control from the beginning. Shield noisy macros like analog cells. Shield long busses by routing ground (VSS) stripes beside them. Macros like DPLLs, DLLs, DCDLs and RAMs should be placed near the blocks that use them.
●
● ● ●
● ●
Otherwise, long outputs running from such macros would be disastrous from a crosstalk perspective. Create placement and routing blockages around some of these skew-sensitive macros to avoid possible crosstalk. Interleave address and data busses with each other during port placement of the internal blocks. It is less likely that the address bits and data bits change their value at the same time. Use ILM-like models for top-level analysis. Attack the root cause to reduce the number of iterations. Control various parameters of the flow to get the results for the right analysis. For example, control the number of aggressors and scaling factors, clock grouping, etc. to perform accurate analysis. Post-process the reports efficiently to filter out the real violations. Derive efficient methods to fix the violations with minimal effect on other domains.
The timing closure engineers should be cognizant of the PD process and its impact on timing, so timing closure should be a continual process alongside PD closure. The impact of chip finishing, like DFM, is another important aspect that the timing engineer should not leave until the end. The metal fill, redundant vias, etc. can impact the ground capacitance and the net resistance, and therefore the crosstalk. Hence, the DFM process and its impact on timing should be assessed well ahead of the tape-out phase.

Power analysis for signoff

Rail analysis (i.e. static and dynamic IR analysis) is usually done once the design has run through PD once. However, this data could be very important in validating the power plan and the floorplan itself. Accuracy of analysis is the key; the designer should be careful in choosing the correct corner while generating the power analysis data for rail analysis. Knowing where and how to apply power rail analysis can save a great deal of time in power planning and verification. For today’s designs, both static and dynamic analysis should be utilized from initial floorplanning (power planning) through signoff. Some important points to consider when determining a methodology for power rail analysis include:
● Use static analysis to generate robust power rails (widths, vias, etc.).
● Use dynamic analysis to optimize the insertion of de-coupling capacitance.
● In the case of a power-gated design, use static analysis to optimize power switch sizes to minimize IR drop.
● Use dynamic power-up analysis to optimize power switch sizes to control the power-up ramp time.
● Use both static and dynamic analysis early and late in the design flow.
● Establish IR drop limits based on an understanding of how IR drop can affect timing.
● Try to optimize decoupling capacitors early in the flow, since late optimization of de-caps can lead to major re-work.
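To illustrate what static rail analysis computes, the sketch below models a single power rail fed from one end as a resistor ladder with tap currents: the cumulative drop at each tap is the sum of segment resistance times downstream current. Segment resistances and currents are invented for illustration; a real rail analysis tool solves the full power mesh, not a one-dimensional ladder.

```python
def ir_drop_profile_v(segment_res_ohm, tap_currents_a):
    """Static IR drop at each tap of a rail fed from one end (toy model)."""
    drops = []
    v = 0.0
    for i, r in enumerate(segment_res_ohm):
        downstream = sum(tap_currents_a[i:])  # current flowing through this segment
        v += r * downstream                   # V = I * R, accumulated along the rail
        drops.append(v)
    return drops

# 4 taps, 5 mOhm segments, 10 mA per tap -> worst drop 0.5 mV at the far end
profile = ir_drop_profile_v([0.005] * 4, [0.010] * 4)
worst_drop = profile[-1]
```

The monotonically growing profile shows why the taps farthest from the pads set the IR-drop limit and why tap placement matters.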
Another important part of rail analysis is EM analysis. Designers must ensure that the design meets the DC current density limits of the power mesh and the power rails before finalizing the power plan. This becomes more critical if the design is packed with memories; the power tapping to the memories needs to be sufficient to sustain the current densities. Signal EM is also becoming crucial for high-frequency designs at technology nodes below 40nm. Signal EM can impact the clock network; however, if proper care is taken, such as using NDRs for the clock network, then the chances of signal EM violations can be reduced. Before analyzing EM violations, the designer must validate the DC and AC current limits in the tech file and cross-check them against the DRM.
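A sketch of the DC current-density check implied above, for a single power strap: compare I / (width x thickness) against a density limit. The dimensions and the limit are placeholders; real limits come from the tech file and must be validated against the DRM as noted above, and in practice are often width- and temperature-dependent rather than a single per-area number.

```python
def current_density(current_ma, width_um, thickness_um):
    """DC current density in mA/um^2 through a rectangular strap cross-section."""
    return current_ma / (width_um * thickness_um)

def em_violation(current_ma, width_um, thickness_um, limit_ma_per_um2):
    """True if the strap exceeds the (hypothetical) density limit."""
    return current_density(current_ma, width_um, thickness_um) > limit_ma_per_um2

# 50 mA through a 2 um wide, 0.5 um thick strap vs. a 40 mA/um^2 limit
j = current_density(50.0, 2.0, 0.5)            # 50.0 mA/um^2
violates = em_violation(50.0, 2.0, 0.5, 40.0)  # True -> widen the strap
```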
The designer should also be aware of the operating conditions while generating the current density models for EM analysis; typically EM is signed off at 110°C, and the tech file and DRM provide the tables for 110°C. In summary, there is a lot at stake when the power requirement is established and power is estimated. Hence, it’s extremely important to have data that is as accurate as possible early on in the process, because these decisions provide the direction for the ASIC development and must be made in its very early stages. It is also advisable to do a complete rail analysis (i.e. static, dynamic IR and EM analysis) well before the tape-out phase; having to touch a lot of metal layers to fix rail violations after timing closure could cause schedule overhead.

Summary

Large SoCs in a smaller-geometry technology increase the design complexity multifold. The traditional waterfall approach of SoC implementation can no longer guarantee a predictable schedule and reliable silicon. Upfront and thorough thinking, in every aspect of SoC development, is needed for today’s SoC designs. Thorough planning is required from an early stage, and all functions must work cohesively and in parallel with each other. More reviews and more involvement of teams across the functions can reduce the risk of mistakes. Additionally, with tighter constraints at advanced technology nodes (mainly variation), and with the cost of manufacturing, it is important to place more hooks and checks prior to signoff for first-silicon success.

More about Bhupesh Dasila

Also see:
● Design planning for large SoC implementation at 40nm: Guaranteeing predictable schedule and first-silicon success
● Design planning for large SoC implementation at 40nm - Part 2
● Kilo develops embedded multi-time programmable non-volatile memory in 40nm logic CMOS