FEATURE: EDA
A QUESTION OF FINDING THE RIGHT BALANCE
Ravi Andrew and Madhuparna Datta at Cadence Design Systems explore how the performance boundaries of ARM Cortex-M processors are being pushed by modern design techniques and the impact this is having on the future of embedded design
One of the toughest challenges in the implementation of any processor is balancing the need for the highest performance with the conflicting demands for the lowest possible power and area. Inevitably, there is a trade-off between power, performance, and area (PPA). This trade-off has been explored on the ARM Cortex-M7 processor using state-of-the-art tools from Cadence. The exercise set out to address two challenges simultaneously:
1. Reach a performance level that is as fast as possible (AFAP) with optimal power
2. Reduce power to the minimum for a lower-frequency scenario (Min Power)
DYNAMIC POWER CONTRIBUTORS: There are three key reasons for dynamic power dissipation. First, the large number of devices and wires integrated on a big chip increases the total capacitance. Second, clock frequencies are high. Third, gates are used inefficiently. The expression for dynamic power is:

Pdynamic = αCV²f/2 (1)

where α is the switching activity factor, C the switched capacitance, V the supply voltage and f the clock frequency.

Five key components of dynamic power consumption in a digital IC design are:
• Standard cell logic and local wiring
• Global interconnect (mainly buses, inter-modular routing, and other control)
• Global clock distribution (drivers + interconnect + sequential elements)
• Memory (on-chip caches), which is constant in our case
• I/Os (drivers + off-chip capacitive loads), which is constant in our case
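As a rough illustration of how the terms in equation (1) interact, the sketch below evaluates the expression in its standard form, in which power scales with the square of the supply voltage. The activity factor, capacitance and voltage are hypothetical values chosen for the example and are not figures from the Cortex-M7 study.

```python
def dynamic_power(alpha, cap_farads, vdd_volts, freq_hz):
    """Dynamic power in watts, assuming P = alpha * C * V^2 * f / 2."""
    return alpha * cap_farads * vdd_volts ** 2 * freq_hz / 2.0


# Hypothetical design parameters (assumptions, not measured data).
ALPHA = 0.15   # switching activity factor
CAP = 2.0e-9   # total switched capacitance in farads
VDD = 0.9      # supply voltage in volts

p_400 = dynamic_power(ALPHA, CAP, VDD, 400e6)             # 400MHz scenario
p_200 = dynamic_power(ALPHA, CAP, VDD, 200e6)             # 200MHz scenario
p_less_cap = dynamic_power(ALPHA, 0.9 * CAP, VDD, 400e6)  # 10% less capacitance

print(f"400MHz: {p_400 * 1e3:.2f} mW")
print(f"200MHz: {p_200 * 1e3:.2f} mW")
print(f"400MHz with 10% less capacitance: {p_less_cap * 1e3:.2f} mW")
```

With everything else held constant, power in this model falls in direct proportion to frequency and to switched capacitance, which is why shorter wires and smaller cells translate into dynamic power savings.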
TIMING CLOSURE: One fundamental issue of timing closure is the modelling of physical overcrowding. In a multi-layer routing technology, spreading components apart actually increases the wire length and demands more routing space. Accurately predicting detail-routed signal integrity (SI) effects, and their impact on timing, before detail routing takes place is of key interest, because an inaccurate timing prediction at that stage leads to timing changes once routing is done. In addition, reducing the wire lengths and making good judgments based on the timing profiles further reduces power.
14 FEBRUARY 2015 | ELECTRONICS

Standard cell placement plays a vital role. If the placement is done right, it pays off in better Quality of Results (QoR) and reduced wirelength. This is the core principle behind Cadence’s “GigaPlace” placement engine, which places the cells in a timing-driven mode by building up the slack profile of the paths and adjusting the placement based on these timing slacks. Cadence reports good improvements in overall wirelength and Total Negative Slack (TNS) with “GigaPlace”. These two improvements helped push the frequency as well as reduce the power, but there were still opportunities to further benefit the frequency and dynamic power targets.
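For readers less familiar with the slack metrics mentioned above, the following minimal sketch shows how Total Negative Slack and worst negative slack are conventionally derived from a list of endpoint slacks. The function names and the slack values are hypothetical and serve only as an illustration; they are not output from GigaPlace or any Cadence tool.

```python
def total_negative_slack(slacks_ns):
    """Sum of all negative endpoint slacks; paths that meet timing add nothing."""
    return sum(s for s in slacks_ns if s < 0.0)


def worst_negative_slack(slacks_ns):
    """Most negative endpoint slack, or 0.0 if every path meets timing."""
    return min(min(slacks_ns, default=0.0), 0.0)


# Hypothetical endpoint slacks in nanoseconds, before and after a placement change.
before = [-0.12, -0.05, 0.03, -0.20, 0.10]
after = [-0.02, 0.01, 0.04, -0.06, 0.10]

for label, slacks in (("before", before), ("after", after)):
    tns = total_negative_slack(slacks)
    wns = worst_negative_slack(slacks)
    print(f"{label}: TNS = {tns:.2f} ns, WNS = {wns:.2f} ns")
```

A TNS closer to zero means fewer or smaller timing violations remain across the design, which is the kind of improvement a timing-driven placer aims for.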
Figure 2: Normalised power charts at 200MHz
During this exercise, it was observed that the floorplan was bigger than needed and the cell placement density was uniform. These two aspects led to cells being spread out, resulting in longer wirelength and higher clock latencies. The engineers at Cadence changed the floorplan to keep the standard cell densities above 75 percent.
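The relationship between floorplan size and standard cell density can be sketched with simple arithmetic. The example below assumes density is total standard cell area divided by placeable core area, and ignores macros, blockages and routing halos. The area figures are hypothetical and were chosen only to show how a 75 percent target bounds the core size.

```python
def placement_density(cell_area_um2, core_area_um2):
    """Fraction of the placeable core area occupied by standard cells."""
    return cell_area_um2 / core_area_um2


def max_core_area(cell_area_um2, min_density):
    """Largest core area that still keeps density at or above min_density."""
    return cell_area_um2 / min_density


# Hypothetical areas in square microns (assumptions, not data from the study).
CELL_AREA = 300_000.0
ORIGINAL_CORE = 500_000.0
TARGET_DENSITY = 0.75

density = placement_density(CELL_AREA, ORIGINAL_CORE)
limit = max_core_area(CELL_AREA, TARGET_DENSITY)

print(f"Original density: {density:.0%}")
print(f"Core area must not exceed {limit:,.0f} um^2 for {TARGET_DENSITY:.0%} density")
```

In this toy case the original floorplan is oversized for the target, so shrinking the core toward the computed limit would pull cells closer together and shorten the wires between them.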
Figure 1: Normalised power numbers at 400MHz
IN-ROUTE OPTIMISATION: “In-route optimisation” enables an accurate view of timing/SI and makes bigger optimisation moves without disrupting the routes. This technology utilises an internal extraction engine for more effective RC modelling. The timing QoR improvement observed after post-route optimisation was significant, at the expense of a slight runtime increase (currently observed at only 2%). This optimisation technology pushed the design to achieve the targeted frequency.

In general, EDA tools make the trade-off to prioritise timing. However, opportunities exist to reduce area and gate capacitance by swapping cells for variants with lower gate capacitance and by reducing the wirelength. To address dynamic power reduction in the design, further experiments were conducted to examine these aspects.

In the first set of experiments, two main tool features were used in the post-route stage: the “dynamic power optimisation engine” and the “area reclaim” feature. These options saved 5% of dynamic power at 400MHz and brought the company nearly halfway to the power target. The next set of experiments involved design changes: flops were downsized to the minimum drive strength at the pre-CTS (clock tree synthesis) optimisation stage, and the remaining flops of higher drive strength were set to “don’t use”.

By using these latest tool technologies
from Cadence and its design techniques, the company was able to achieve 10 percent better frequency and reduce the dynamic power by more than 10 percent.

The joint ARM/Cadence work started with addressing challenges at two points (scenarios) on the PPA curve:
1. Frequency focus with optimal power (400MHz)
2. Lowest power at reduced frequency (200MHz)

For scenario #1, the out-of-the-box Cadence Encounter platform 14.1 allowed the company to reach 400MHz. With the use of PowerOpt technology, available in Encounter Digital Implementation System 14.1, Cadence was able to reduce power to an optimal number. For scenario #2, additional use of GigaPlace technology and inherently better SI management made a much higher power reduction possible at 200MHz. A 38 percent dynamic power reduction (for standard cells) was demonstrated going from the 400MHz, 13.2-based run to the 200MHz, 14.2 best power recipe run.
Cadence Design Systems
www.cadence.com +49 (0) 89 4563 1726