Verification Tools Help PHS Transceiver Take Silicon Form

Software-verification tools support the constant evolution and improvement of many integrated circuits (ICs). For example, these tools have enabled designers to shrink the size and cost of personal-handy-phone-system (PHS) ICs while increasing the level of integration. Typically, PHS transceivers are implemented as a multi-RF chip solution. To satisfy the growing demand for lower-cost PHS handsets in China and other countries, however, engineers from Microlinear and Cadence have applied software-verification tools to develop a single-chip PHS transceiver.

The block diagram of the single-chip transceiver is shown in Fig. 1. Design tools were needed to manage the complexity that comes with a large number of components and digital gates. The transceiver incorporates a low-IF receiver architecture demonstrating −105 dBm sensitivity, a transmitter with an on-chip +21-dBm output PA, and a single fractional-N synthesizer achieving <30-µs lock time for fast handoff capability. The architecture's high level of integration and complexity resulted in a 2X to 3X increase in digital-logic content, from approximately 3 kgates to nearly 13 kgates. It also resulted in significantly larger die size. The design therefore required the adoption of new analysis tools as well as automated methodologies for synthesizing, placing, and routing the digital logic.

The individual parts of this transceiver IC also are very complex. The receiver subsystem, for example, consists of an RF I/Q downconverter. The RF downconverter section, in turn, consists of a differential low-noise amplifier (LNA) and an image-reject mixer. The required sensitivity is −100 dBm and blockers can be as high as 50 dBm. As a result, the RF front end requires a minimum instantaneous linear dynamic range in excess of 53 dB. The IF upconverter subsystem consists of a 450-kHz bandpass filter (BPF), IF amplifier, upconverter, 1.2-MHz BPF, second IF buffer, and polyphase network.

Transmitter Design
The transmit-circuit block diagram is also shown in Fig. 1. The transmitter is designed to accept DC-coupled, quadrature PHS baseband inputs. The baseband signals are buffered and applied to the quadrature modulator. There, they are combined and directly converted to the 1.9-GHz RF. With the help of design tools, both the PA and I/Q modulator eliminate the traditional filter that is seen between the modulators and PA in PHS and many other transmit designs.

To achieve high linearity, the modulator was designed with a topology that is similar to the one used in the RX. The TX chain includes a digitally controlled, 32-dB programmable-gain amplifier (PGA). This PGA is used to shape ramp profiles during transmission power-up and power-down cycles. It also functions as a digital control for setting the output power. The integrated PA consists of two stages of Class A/B amplifier with external inductors for matching and DC supply.

PLL Design
To meet the challenging phase-noise and switching-time requirements of the PHS standard, a highly configurable fractional-N architecture was selected. Fractional-N synthesizers break the traditional tight relationship between PLL bandwidth, resolution, and reference spur level. A fractional-N synthesizer provides the opportunity to have extremely fine resolution while simultaneously keeping a large loop bandwidth. Compared to their integer-N counterparts, fractional-N synthesizers have a smaller settling time and lower close-in phase noise. Yet these features come at the expense of higher out-of-band noise and spurs.
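The resolution argument can be made concrete with a short calculation. The sketch below is illustrative only: the 19.2-MHz reference, the modulus of 64, the 300-kHz step, the example output frequency, and the function names are all assumptions chosen for the arithmetic, not values taken from the design described here.

```python
from fractions import Fraction

def integer_n_step(f_ref_hz):
    """An integer-N PLL (f_out = N * f_ref) can only step by f_ref."""
    return Fraction(f_ref_hz)

def fractional_n_step(f_ref_hz, modulus):
    """A fractional-N PLL (f_out = f_ref * (N + k/modulus)) steps by f_ref/modulus."""
    return Fraction(f_ref_hz, modulus)

def divider_settings(f_out_hz, f_ref_hz, modulus):
    """Return (N, k) with f_out = f_ref * (N + k/modulus); raise if off-grid."""
    ratio = Fraction(f_out_hz, f_ref_hz) * modulus
    if ratio.denominator != 1:
        raise ValueError("target frequency is not on the synthesizer grid")
    return divmod(ratio.numerator, modulus)

# Hypothetical values, for illustration only.
F_REF_HZ = 19_200_000      # assumed reference frequency
MODULUS = 64               # assumed fractional modulus
F_OUT_HZ = 1_901_100_000   # assumed output frequency near 1.9 GHz

step = fractional_n_step(F_REF_HZ, MODULUS)
n, k = divider_settings(F_OUT_HZ, F_REF_HZ, MODULUS)

print(f"fractional-N step: {float(step) / 1e3:.0f} kHz with a {F_REF_HZ / 1e6:.1f}-MHz reference")
print(f"divider for {F_OUT_HZ / 1e6:.1f} MHz: N = {n}, k = {k}")
# An integer-N design with the same step would need a reference equal to the step.
print(f"integer-N reference needed for the same step: {float(integer_n_step(step)) / 1e3:.0f} kHz")
```

Under these assumptions, the fractional-N loop keeps its full 19.2-MHz reference while stepping in 300-kHz increments. An integer-N loop with the same step would have to run from a 300-kHz reference, and because loop bandwidth is normally kept well below the reference frequency, it would settle more slowly.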

The digital-control-logic block controls the many programmable blocks within the design. Registers are provided on the chip to allow the control of many of the parameters within the PLL, receive, and transmit blocks. For example, the gain of the receive path can be controlled at several stages by setting the gain of the LNA or one of several VGA stages. The control logic does all of the required power management for the IC.

Methodology Overview
In the design of mixed-signal ICs, analog functions dominate the die area. As a result, analog IC designers often design the digital portions as well. A common practice is to create a small standard-cell library (e.g., about 40 cells) and place logic symbols on a schematic. The designer will then do a Verilog gate-level or transistor-level simulation. Because of the small gate counts (<10 kgates), the layout is done by hand instead of using industry-standard digital-block place-and-route tools.

This type of manual logic design and layout process can still be very time-consuming, however. For a 5.8-GHz digital-cordless-telephone chip that used this methodology, the design and layout of the logic blocks for the PLL synthesizer and control logic took over four man-months of effort. All of the logic, including a scan chain for ATE, was designed manually. Altogether, these blocks involved about 3000 two-input, NAND-gate logic equivalents.

Once the gate counts begin to exceed 10 kgates, as they did on the current design, manual design and layout techniques become impractical. Another challenge is that most digital designers are accustomed to completing designs comprising several millions of gates by using highly advanced, specialized, and expensive digital-logic design tools. Yet these tools require a great deal of time/effort, which is not practical for analog designers to invest. From a financial and technology perspective, these digital tools are probably seen as "overkill" for designs between 10 and 20 kgates.

To accommodate the digital-logic cells and gates, an analog place-and-route methodology was adopted. Conceptually, the adaptation consisted of adding a logic-synthesis step at the front end of an already well-established analog implementation methodology. The netlist was then imported into the Cadence Virtuoso environment. Virtuoso's analog place-and-route tools were applied to a standard-cell design. The details of this methodology are shown in Fig. 2.

The specification of the digital-control logic is first captured in RTL Verilog. Next, the Verilog is functionally simulated in NCSim to verify that all functional specifications are met. Once RTL Verilog development is completed, the code is synthesized using standard logic-synthesis tools. Appropriate input and output timing constraints along with estimated output loading are used during synthesis. Clock-tree and scan-chain synthesis also are performed. Afterward, the synthesized netlist functionality is compared with the original RTL Verilog using standard logic equivalency-checking tools.

A gate-level netlist is created at the chip level. It includes modules for the analog blocks. Automatic-test-pattern-generation (ATPG) tools, such as Cadence Encounter Test, use this netlist to create ATPG vectors. In addition, a testbench enables simulation in NCSim.

The synthesized netlist is modified to include power and ground pins in the DFII standard-cell symbol views. The netlist is subsequently imported into the DFII schematic database. Mask designers can then use connectivity- or netlist-driven physical design tools for implementing the logic. The DFII techfile also must be modified to include process-design rules (metal spacing, minimum area, etc.) and via definitions. In addition, layout translation and placement rules must be developed. Optionally, do-files may be set up to handle engineering change orders (ECOs).

The imported schematic is used to generate a layout view with connectivity. First, pins are manually placed to support the chip floorplan. After power planning, automated placement is performed using a custom-logic placer (in this case, Virtuoso Custom Placer). The placement may need to be manually adjusted to minimize net lengths on critical nets. Placement is an iterative process that is run until an optimal placement is achieved. The design is then routed using a custom router, such as Virtuoso Chip Assembly Router (VCAR).

Physical verification is performed using Assura DRC and LVS. Design-rule violations can be fixed either by iterating through the place-and-route process or by manually editing the layout. Assura RCX is run on the clean layout to generate a SPEF file. This file contains parasitic information for the interconnect nets. To perform a static-timing analysis, the logic-synthesis tool uses the SPEF file along with the gate-level netlist.

This design flow realizes a complete digital design and implementation process from RTL to GDSII with checks for functionality, logic equivalency, and timing. The flow has a low incremental cost to analog designers who are already using a custom IC environment like the Virtuoso custom design platform. It takes advantage of analog place-and-route capabilities to perform digital implementation. In addition, the flow is easy for the analog designers to pick up. It is run from the Virtuoso platform with which they are already familiar.

In this case, training for the flow and tool use was provided through formal training as well as on-site application-engineering (AE) support. Because the design was not timing-critical, the team was able to use the analog place-and-route methodology. If the design were timing-driven, the designers would have migrated to a digital-centric methodology. The analog place-and-route tools are effective for up to 20 to 30 kgates. For designs larger than that, a digital-centric methodology like the Virtuoso Digital Implementation Option is recommended.

Implementation Results
Microlinear and Cadence engineers used this approach to create the digital blocks for a 1900-MHz PHS transceiver chip. The chip was manufactured in the Jazz Semiconductor 0.35-µm CMOS SiGe60 process. The chip was produced in two versions (Fig. 3). The first version has approximately 10,000 two-input NAND-gate equivalents. The second version has approximately 12,700 two-input NAND-gate equivalents. All automated routing was done with three layers of metal. A fourth layer is available, but it is 3 microns thick. Because Metal 1 is already present in the standard blocks, all routing was effectively done in two layers of metal.

The initial version of the chip had all of the logic partitioned into 11 blocks (Table 1a). With the new, automated implementation methodology, the silicon area was better utilized. Another benefit of the new methodology was that it enabled better floorplanning. All of the functionality was merged into only two blocks. The results are shown in Table 1b. Figure 3 shows the digital control logic on the PHS transceiver die.

One of the advantages of the automated digital-implementation methodology was a dramatic improvement in design productivity. Ultimately, productivity went from four months (10 kgates) down to three days (12.7 kgates). This decrease resulted in a much lower cost of making changes to the design. As a result, the logic functionality did not have to be frozen until three weeks before tapeout. Using the previous manual implementation methodology, such an approach would not have been possible.

Compared to the previous methodology, the new digital implementation methodology eventually yielded a nearly 50x improvement in design productivity (gates/day). The quality of the results in terms of area was good. In addition, it was possible to take advantage of automatic scan-chain insertion. Because most of the logic was done in this manner, the interface between the digital and analog portions was better controlled. Concerns at this interface included level-shifting signals and avoiding floating gates. Changes to the logic functionality could then occur later in the design process with less overall impact. In the second version, overall area utilization improved because the design did not have to be partitioned into small pieces for the router.
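The productivity figure can be checked with simple arithmetic. The sketch below uses the gate counts and durations quoted above. The 30-day calendar month is an assumption made here, since the article does not say how a month was counted, and the function name is illustrative.

```python
def gates_per_day(gates, days):
    """Design productivity expressed as gates implemented per day."""
    return gates / days

DAYS_PER_MONTH = 30  # assumption: calendar months; the article does not specify

# Manual flow: about 10 kgates in four months (figures from the article).
manual = gates_per_day(10_000, 4 * DAYS_PER_MONTH)

# Automated flow: 12.7 kgates in three days (figures from the article).
automated = gates_per_day(12_700, 3)

improvement = automated / manual

print(f"manual:      {manual:.0f} gates/day")
print(f"automated:   {automated:.0f} gates/day")
print(f"improvement: {improvement:.1f}x")
```

With calendar months, the ratio comes out at roughly 50x, in line with the quoted figure. Counting only working days in a month would give a smaller ratio, so the result depends on that assumption.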

The tables show the results of the new flow. The area in square microns represents a bounding box for the prBound layer in DFII. The wasted area is the portion of the overall bounding box that is not filled with standard blocks. Because the layout consists of rows, and not all rows are the same length, the wasted area is inherent to standard-block layout. It does not matter if it is done manually or with place and route.
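The wasted-area definition can be illustrated with a small calculation. This is a simplified sketch under stated assumptions: every row has the same height, cells fill each row without gaps, and the bounding box is as wide as the longest row. The row lengths and row height are made-up numbers, not data from the tables.

```python
def row_layout_area(row_lengths_um, row_height_um):
    """Return (bounding-box area, wasted area, utilization) for a row-based block.

    Assumes equal-height rows, no gaps inside a row, and a bounding box
    whose width equals the longest row.
    """
    bbox_width = max(row_lengths_um)
    bbox_area = bbox_width * row_height_um * len(row_lengths_um)
    cell_area = sum(row_lengths_um) * row_height_um
    wasted = bbox_area - cell_area
    return bbox_area, wasted, cell_area / bbox_area

# Hypothetical block: four rows of unequal length, 10 microns tall.
rows_um = [400, 380, 350, 400]
bbox, wasted, utilization = row_layout_area(rows_um, 10)

print(f"bounding box: {bbox} sq. microns")
print(f"wasted area:  {wasted} sq. microns")
print(f"utilization:  {utilization:.1%}")
```

Only a block whose rows are all the same length would show zero wasted area under this model, which is why some waste appears regardless of how the placement is produced.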

In some cases, the area was not as important and little effort was put into manually shrinking the placement results. One block (Block 1 in the new chip) proved very difficult to route. It required the rearrangement of the blocks, shifting open space to the congested area. It routed in the end, but with very dense metal layout.

For several years, voltage (IR)-drop and electromigration (EM) analysis have been recognized as growing concerns in high-complexity analog design. One driving factor has been the increasing levels of integration, resulting in larger die sizes. The physical effects associated with more advanced semiconductor processes also are feeding those concerns.

When the IR voltage drops across IC interconnects are excessive, "headroom" decreases to a point at which the circuit may no longer work. This does not solely apply to the information-bearing signals being processed (e.g., the analog/RF). It also applies to control signals. If the input to a voltage regulator ends up lower than expected, the device might cease to regulate. This lack of regulation will compromise the performance of several portions of the IC, which might otherwise be working fine.
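The regulator case lends itself to a quick numerical check. The sketch below is a minimal illustration with hypothetical values: the supply voltage, rail resistance, load current, regulated output, and dropout voltage are all assumptions chosen to show how a modest IR drop can remove the headroom a regulator needs.

```python
def regulator_input_voltage(v_supply, rail_resistance_ohm, load_current_a):
    """Voltage reaching the regulator input after the IR drop on the supply rail."""
    return v_supply - load_current_a * rail_resistance_ohm

def regulator_has_headroom(v_in, v_out, dropout_v):
    """True if the input exceeds the regulated output by at least the dropout voltage."""
    return (v_in - v_out) >= dropout_v

# All values are hypothetical, chosen only to illustrate the check.
V_SUPPLY = 3.0   # volts at the supply pad
V_OUT = 2.7      # volts, regulated output
DROPOUT = 0.25   # volts of headroom the regulator needs
RAIL_R = 0.5     # ohms of interconnect resistance to the regulator
LOAD_I = 0.2     # amps drawn through that interconnect

v_in = regulator_input_voltage(V_SUPPLY, RAIL_R, LOAD_I)
ideal_ok = regulator_has_headroom(V_SUPPLY, V_OUT, DROPOUT)
actual_ok = regulator_has_headroom(v_in, V_OUT, DROPOUT)

print(f"regulator input after IR drop: {v_in:.2f} V")
print(f"regulates with ideal wiring:   {ideal_ok}")
print(f"regulates with IR drop:        {actual_ok}")
```

In this example a 0.1-V drop on the rail is enough to take the regulator out of regulation, even though the schematic-level supply looks adequate.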

Electromigration problems are even more insidious. Often, these problems don't surface until after the IC has been tested in full operation. In the best cases, these issues are discovered during post-manufacturing test. In the worst cases, they are revealed when the end user returns his or her faulty product to the store where it was bought.

Traditionally, designers have tried to safeguard against these problems by using manual/visual techniques for review/inspection. Such techniques are tedious and error-prone. Alternatively, designers run additional simulations or design circuitry with sufficient "guardbands" to allow for these factors. A more effective approach is now being adopted: Leverage the extensive simulation data that is gathered during the analog design process. Then analyze that data in new ways to uncover potential problems related to IR-drop, ground-bounce, power-rail electromigration, and signal electromigration. The Cadence Virtuoso Analog Design Environment offers two optional tool packages (Analog VoltageStorm and Analog ElectronStorm) that can be used for this purpose.

The Virtuoso Analog VoltageStorm and Analog ElectronStorm options use the existing Virtuoso infrastructure to perform IR drop and EM analysis. The flow does not require any modifications to Assura DRC/LVS/RCX decks. In addition, it uses the existing testbenches. The only input file that needs to be created is an "emDataFile.txt" to specify the current density limits for a given process.

The typical flow begins with a DRC- and LVS-clean layout (Fig. 4). The next step is to create an extracted netlist with parasitic resistance for all nets. Optionally, parasitic capacitors for dynamic IR-drop analysis can be included. The extracted view is used along with the existing testbenches to run either a DC or transient analysis. Either circuit (Virtuoso Spectre Circuit Simulator) or FastSPICE (Virtuoso UltraSim Full-chip Simulator) simulation engines can be used to perform the simulations.

The FastSPICE option allows analysis of large circuits including parasitics. The simulation results are postprocessed by the Virtuoso Analog VoltageStorm Option to generate an "IR-Drop" map on the layout. This map shows areas of voltage drops according to pre-defined ranges. Transient-analysis results give either the average or worst-case IR drop for the entire simulation period. The Virtuoso Analog ElectronStorm Option uses the simulation results and an "emDataFile.txt" to generate an "EMFailure" map on the layout.

Figure 5a shows an "IR-Drop" map of the layout with pre-defined ranges and the associated color codes. The VDD in the example shows an IR drop in the range of 2.572 to 3.085 mV. Figure 5b shows the nets that have an EM issue. In this case, net "re468" has failed the current density checks by 214.4 percent. To satisfy the current density limits, the net width must be increased from the existing (measured) width of 0.23 microns to the new (minimum) width of 0.723 microns, which also is shown here. If the width of this net segment were not increased, the net would be prone to electromigration failure. Faulty silicon would result.
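The relationship between the reported overload and the new minimum width can be reproduced with a short calculation. The sketch below rests on two assumptions that the article does not state explicitly: that "failed by 214.4 percent" means the current density exceeds the limit by that amount, and that current and metal thickness are fixed, so current density scales inversely with width. The function names are illustrative and are not part of any tool.

```python
def required_width_um(measured_width_um, overload_percent):
    """Minimum width that brings current density back to its limit.

    Assumes fixed current and metal thickness, so density scales as 1/width.
    """
    return measured_width_um * (1 + overload_percent / 100)

def overload_percent(measured_width_um, minimum_width_um):
    """Percentage by which current density exceeds the limit at the measured width."""
    return (minimum_width_um / measured_width_um - 1) * 100

# Values quoted in the article for net "re468".
new_width = required_width_um(0.23, 214.4)
implied_overload = overload_percent(0.23, 0.723)

print(f"required width:   {new_width:.3f} microns")
print(f"implied overload: {implied_overload:.1f} percent")
```

Under these assumptions the calculation returns a width of about 0.723 microns, matching the reported minimum. Working backward from the rounded widths gives an overload slightly below the quoted 214.4 percent, which is consistent with rounding in the reported values.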

The fully integrated PHS transceiver that is discussed here has an integrated PA and fast-locking PLL/VCO with greatly reduced bill-of-materials (BOM) complexity and cost. That large reduction in complexity and BOM count for a PHS phone was achieved by the integration of critical/costly RF and IF functional blocks, such as the PA, PLL/VCO, IF filters, and other traditional functions within a single 0.35-µm SiGe-BiCMOS IC. The transceiver IC is designed to satisfy the continued growth in demand for very low-cost PHS handsets in China and other countries.