Multimedia SoCs are more than a process

19 March 2008

The dominance of software-based algorithms running on DSP and CPU cores in consumer and mass market applications, such as mobile phones and portable multimedia players, drives the demand for ever higher levels of performance from system on chip (SoC) designs.

eDRAM and chip on chip memory

The complex nature of the software, coupled with the growth in data volumes, is also fuelling the need for larger integrated memory and wider connectivity (for example, high-speed serial and double data rate parallel interfaces) to other system components. Despite this, the portable and battery-powered nature of such designs means that highly integrated, small-footprint solutions that operate with ultra-low power consumption are essential. Furthermore, in addition to the technical challenges facing designers, ever shorter time-to-market windows lead to pressure for ‘first time right’ designs, higher levels of design productivity and minimum turnaround time. This raises the question of how designers can address the seemingly contradictory requirements placed on SoCs destined for new and emerging applications.

The key lies in careful consideration of the technical requirements (processes, memory options, IP cores, and packaging approaches) and the design methodologies and support available to keep turnaround to a minimum.

Focus on process shouldn’t just mean striving for the latest ultra-deep sub-micron solution; new system-in-package (SiP) technologies are giving designers greater flexibility. They make it increasingly practical to bring together different dies in the same package, combining the performance benefits of new technologies with the cost savings, lower risk and shorter development time of proven IP typically associated with more mature processes.

Sub-micron processes
Clearly, new and emerging ultra-deep sub-micron processes are helping to maximise SoC performance. However, performance improvements come with an associated increase in power consumption. Increases in dynamic power can be compensated for by a more advanced process (featuring a lower supply voltage and lower parasitic capacitances), but thinner oxide layers and smaller transistors increase leakage currents (standby power).
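The trade-off can be illustrated with the standard first-order relation for CMOS switching power, P = a × C × V² × f. This relation is textbook background rather than something taken from the TC320 documentation, and every number below is a hypothetical value chosen only to show the trend.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """First-order CMOS switching power in watts: P = a * C * V^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical figures, not process data: same clock and activity,
# but a lower supply voltage and a lower switched capacitance.
older = dynamic_power(activity=0.2, capacitance_f=1.0e-9, vdd_v=1.2, freq_hz=300e6)
newer = dynamic_power(activity=0.2, capacitance_f=0.8e-9, vdd_v=1.0, freq_hz=300e6)

ratio = newer / older
print(f"dynamic power ratio (newer/older): {ratio:.2f}")
```

Because the supply voltage enters as a square, even a modest voltage reduction has a large effect on dynamic power. Leakage is not modelled here and has to be budgeted separately.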

Where low power and high performance were once both inherent in transistors, design teams are now required to apply their judgement in the use of high or low threshold voltage transistors, which deliver either low standby power or high performance. It is possible to tune a digital circuit to deliver both low power and high performance by using high threshold transistors where speed is not crucial and reserving low threshold transistors for critical paths.
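As a rough sketch of that tuning step, the snippet below assigns a threshold type to each cell according to its timing slack. The cell names, slack values and the 40ps delay penalty are all hypothetical, and a real multi-threshold flow works on whole paths and iterates with the timing engine rather than deciding cell by cell.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str
    slack_ps: float  # timing slack when built from fast, low-threshold cells


def assign_threshold(cells, hvt_penalty_ps):
    """Use high-threshold (low-leakage) cells wherever the slack can absorb
    their extra delay; keep low-threshold (fast) cells on critical paths."""
    return {
        c.name: "HVT" if c.slack_ps >= hvt_penalty_ps else "LVT"
        for c in cells
    }


# Hypothetical design data for illustration only.
cells = [Cell("alu_add", 5.0), Cell("ctrl_dec", 120.0), Cell("dbg_mux", 400.0)]
plan = assign_threshold(cells, hvt_penalty_ps=40.0)
print(plan)
```

Only the cell with almost no slack keeps a low-threshold implementation; the others can trade speed they do not need for lower standby power.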

Beyond the physical challenges of fabricating fast and power-efficient transistors on an ultra-deep sub-micron scale, the sheer density of transistors presents further technical barriers. The once simple process of creating a global clock tree is now a thing of the past. Complex clock gating and dynamic frequency/voltage scaling are necessary to remain within ever-tighter power budgets without sacrificing performance. With these factors in mind, Toshiba developed its latest process technology.

Fabricated using the 65nm CMOS process, Toshiba’s TC320 ASIC family brings together copper technology, low-k dielectric, and an aggressive 65nm (50nm drawn gate) solution, and supports up to eight levels of copper metal plus one level of aluminium interconnect. The technology offers a number of improvements over the 90nm (TC300) family, such as twice the logic density, a 30 per cent reduction in power per gate, and a 20 per cent reduction in gate delay. Through the development of three types of transistors with different threshold voltages, the TC320 process offers low-power, high-speed and very high-speed transistor libraries.

Table 1 outlines the general specification of the TC320 family. In addition to a multi-threshold process capability that allows mixing of logic cells operating at different threshold voltages, the optional 1.0V core voltage – along with other power-saving techniques such as conditionally clocked low-power flip-flops – allows for cuts to dynamic power consumption and leakage current.

Embedded DRAM
Memory is critical to all SoC designs. Embedded DRAM delivers higher performance than external memory and, freed from I/O restrictions, provides high memory bandwidth and lower power consumption. Its use also contributes to lower pin counts and therefore smaller-footprint packages.

Toshiba’s embedded DRAM is based on its trench capacitor technology, which delivers higher memory density than is possible with a conventional planar approach. The eDRAM delivers high memory bandwidth with up to a 256bit wide bus and, when clocked at 350MHz, achieves a maximum data rate of 11.2Gbyte/sec. In addition, trench capacitor-based eDRAM is robust against the SER (soft error rate) effects of cosmic radiation.
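The quoted peak figure follows directly from the bus width and clock rate. The short calculation below reproduces it, assuming one transfer per clock cycle and 1Gbyte = 10^9 bytes (both assumptions, since the article does not state them); the function name is illustrative.

```python
def peak_bandwidth_gbytes(bus_width_bits, clock_hz, transfers_per_clock=1):
    """Peak data rate in Gbyte/sec (1 Gbyte = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_hz * transfers_per_clock / 1e9


# 256bit bus clocked at 350MHz, one transfer per clock (assumed).
bw = peak_bandwidth_gbytes(256, 350e6)
print(f"{bw:.1f} Gbyte/sec")  # prints "11.2 Gbyte/sec"
```

A 256bit bus moves 32 bytes per transfer, so 350 million transfers per second give 11.2Gbyte/sec.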

For ASIC/SoCs requiring higher densities of integrated memory, or where designers want to make use of lower cost dies based on previous process technologies, Toshiba offers semi-embedded DRAM solutions that use multi-chip packaging technology to provide a system in package (SiP). Classical SiP includes side-by-side and stacked die solutions, but Toshiba has now introduced chip-on-chip (CoC). Within a CoC, a second die (memory) can be directly connected, die face down, via tiny micro-bumps, to the surface of the logic die. This delivers a larger memory capacity with the same high bandwidth from a single, standard package.

Successfully creating complex SoC and SiP designs in less time is helped by proven cell libraries and digital and mixed-signal IP cores. The TC320 family offers synthesis-friendly primitive cells for high-performance and low-power chip designs, using a multi-threshold process that enables a mixture of both design targets. In addition, the family offers I/O cells in two forms: narrow-width I/O cells for high pin-count designs, and standard-width I/O cells for core-limited designs.

SiP technology can combine re-used mixed-signal IP, implemented on a more mature process technology, with high-performance 65nm implementations in the same package.

Alongside the demand for embedded DRAM, SRAM remains essential to overall functionality and performance. Here, the TC320 family offers compilable, density- and performance-optimised single-port and multi-port SRAMs with flexible widths and depths.

Lower risk, less time
There are other elements beyond the technology that can further reduce risk, cost, and turnaround time.

Toshiba’s European LSI design and engineering centre (ELDEC) provides local engineering expertise that includes dedicated project management to support customers through every step of the design and implementation process. This local support is combined with a hierarchical design approach and a timing-driven design flow that allows the design team to create sub-blocks in parallel and resolve timing problems at the block level.

During RTL synthesis, accurate models using delay extraction data ensure a closer correlation between the pre-layout and post-layout delays. This accurate delay information is used throughout the design flow, to the extent of optimising the design through gate sizing, repeater insertion and timing-driven routing, to achieve quicker timing closure.

Another major issue facing the industry at 90nm and below is the increase in parasitic delay. This is caused by the capacitive and inductive elements inherent in the thinner and denser interconnects that are necessary in modern designs. Being able to accurately and reliably extract this parasitic information is paramount in a multi-layer process like TC320. With up to eight copper interconnect layers and an additional aluminium layer, being able to understand and control interconnect delay, while maintaining signal integrity, is key to the success of the TC320 family.

The final part of the design and development equation is ensuring that the SoC or SiP is supplied in packaging optimised for the given application.

A range of packaging options is available, depending on the application. For designs requiring high pin counts (600 to over 2,000), flip-chip BGA packaging, PBGA(FC), offers the highest I/O density and electrical performance available today. PFBGAs with 109 to 265 pins have a package body size no larger than 15mm by 15mm and are optimal for applications requiring minimal form factor. PBGAs with 256 to 868 pins are cost-effective solutions for mid-range I/O pin count requirements.

Toshiba offers multi-layer PBGA(4L) packaging with excellent electrical performance for the 256 to 868 pin range. For price-sensitive applications with low pin counts, there is the WCSP (wafer level chip scale package).
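The pin-count ranges quoted above can be captured in a simple lookup, sketched below. The table holds only the ranges stated in this article; "over 2,000" is treated as having no upper limit, and WCSP is left out because no pin range is given for it. The function and table names are illustrative, and a real selection would also weigh cost, body size and electrical performance.

```python
# (package, minimum pins, maximum pins or None for "no stated upper limit")
PACKAGES = [
    ("PFBGA", 109, 265),
    ("PBGA", 256, 868),
    ("PBGA(4L)", 256, 868),
    ("PBGA(FC)", 600, None),
]


def candidate_packages(pin_count):
    """Return the package types whose quoted pin range covers pin_count."""
    return [
        name
        for name, lo, hi in PACKAGES
        if pin_count >= lo and (hi is None or pin_count <= hi)
    ]


print(candidate_packages(260))
print(candidate_packages(700))
print(candidate_packages(1500))
```

Where the ranges overlap, more than one package is returned, which reflects the choice left to the designer between minimal form factor, cost and electrical performance.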

EUGEN PFUMFEL is principal engineer, ASIC and foundry business unit, Toshiba Electronics Europe

