Probabilistic modeling of noise transfer characteristics in digital circuits

Device scaling, the driving force of CMOS technology, has led to a continuous decrease in the energy level representing logic states. The resulting small noise margins, in combination with increasing problems regarding supply voltage stability and process variability, create a design conflict between efficiency and reliability. This conflict is expected to intensify in future technologies. Current research approaches on fault-tolerant architectures and countermeasures at circuit level unfortunately cause a significant area and energy penalty without guaranteeing the absence of errors. To overcome this problem, it seems attractive to tolerate bit errors at circuit level and to employ error handling methods at higher system levels. This requires an estimate of the bit error rate (BER) at circuit level. Due to the size of the circuits, Monte Carlo simulation suffers from impractical runtimes. Therefore a suitable modeling scheme is proposed. The model allows a probabilistic estimation of error rates at circuit level within reasonable runtimes, taking into account statistical effects ranging from supply noise and electromagnetic coupling to process variability.


Correspondence to: T. G. Noll (tgn@eecs.rwth-aachen.de)

Introduction
VLSI technology has adhered to Moore's Law by aggressively scaling down device dimensions. Simultaneously, supply voltages are being decreased. As shown in Fig. 1, the resulting design space is limited by fundamental borders due to quantum mechanical and thermodynamic effects. Current 40-nm CMOS technology is represented by the efficiency curve of a NAND gate with varying supply voltages. Further increases in efficiency in future VLSI technologies move the design point closer to the fundamental limits. At the same time, the increasing number of devices integrated into circuits severely strains the supply networks, making it increasingly challenging to generate low-noise supply voltages. Furthermore, effects such as process variability do not scale with technology. As a result, the timing and switching behavior of logic gates is increasingly susceptible to transient faults. As predicted by the ITRS roadmap (ITRS, 2009), future VLSI technology will therefore face reliability as a new design challenge.
Fault tolerance approaches handling reliability problems date back to the 1950s, when John von Neumann proposed a multiplexing and redundancy scheme for reliable circuits (Neumann, 1956). In more recent approaches von Neumann's concepts have been adapted and enhanced in several ways, e.g. Blaauw et al. (2008); Hegde and Shanbhag (1999). A considerable cost in area and/or power consumption is common to all these approaches. Because parts of the circuitry remain unprotected, complete absence of errors cannot be guaranteed. Consequently there is a design conflict between energy efficiency and reliability.
As an alternative approach, it is promising to tolerate bit errors at circuit level and deal with them at higher system levels. By making use of error handling blocks (e.g. channel decoders) already integrated into most signal processing systems, the cost of increased system reliability can be reduced, thus increasing efficiency. An accurate estimation of the stored bit error rate (BER) (Noll, 2010) generated by hardware because of transient faults is a centerpiece of this approach, since it is needed to make sure that system failure is avoided. Monte Carlo simulation is often used to model statistical effects such as noise and variability; however, due to the large device count in circuits and the number of effects to be taken into account, it requires unacceptable runtimes.
For the effects of variability on timing, statistical static timing analysis (SSTA) (Li et al., 2009) has proved to be very efficient. Because timing errors represent only one class of transient errors, SSTA is not sufficient to estimate the BER at circuit level. Therefore this work proposes a novel statistical modeling technique that takes into account the effects of noise and variability on the error rate.
Fig. 1. Design space: power consumption vs. gate delay (Waser, 2005).

Modeling concept
Although the sources of transient faults in digital circuits are manifold, including electromagnetic coupling as well as supply noise and particle strikes, their effects are very similar. Transient faults can affect the timing and voltage levels of signals. These effects in turn can induce timing errors, invalid signals or bit flips. A model addressing transient faults in general has to take all of these effects into account to accurately estimate the resulting bit error rate. Furthermore, the propagation of errors along the logic gates in a circuit needs to be considered.
Since problem complexity increases with the number of effects to be considered, two simplifications are integrated into the modeling approach. First, making use of the division of logic circuits into pipeline stages, the model can be partitioned accordingly. Second, the logic gates in a pipeline stage are modeled statistically instead of being simulated. For further complexity reduction, the input and output voltages of a logic gate are described by their probability density functions (PDF) instead of signal waveforms (see Fig. 2).
Errors propagate from one pipeline stage to the next only if an incorrect value is stored in the latch separating the two pipeline stages. Hence only individual pipeline stages need to be considered, each consisting of hundreds to thousands of logic gates, as opposed to the millions of gates in a complete system. For this purpose the PDF of the input voltage at the latch at sampling time needs to be known, which is the same as the PDF of the output voltage of the last logic gate in the pipeline stage. By defining voltage intervals for correct, uncertain and incorrect values, the number of bit errors passing the latch can be estimated. As noise and variability influence timing as well as voltage levels, the PDF of the delay along the pipeline stage has to be considered, too. The effects of variability on switching times can be evaluated efficiently with SSTA methods. Moreover, for a comprehensive model allowing estimation of the stored BER, another model is required to obtain the voltage level PDFs of the individual logic gates in a pipeline stage.
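The classification of the latch input into correct, uncertain and incorrect values can be illustrated with a minimal sketch. It assumes that the output voltage PDF of the last gate is available as a histogram (bin edges plus probability mass per bin) and that the mass is spread uniformly inside each bin. The function names, the thresholds `v_low_max` and `v_high_min` and all numbers are hypothetical and not taken from this work.

```python
def mass_between(bin_edges, probs, lo, hi):
    """Probability mass inside [lo, hi], assuming uniform density within each bin."""
    total = 0.0
    for left, right, p in zip(bin_edges[:-1], bin_edges[1:], probs):
        overlap = min(right, hi) - max(left, lo)
        if overlap > 0:
            total += p * overlap / (right - left)
    return total


def classify_latch_input(bin_edges, probs, v_low_max, v_high_min, expected_bit):
    """Split the sampled-voltage distribution into correct/uncertain/incorrect mass.

    v_low_max and v_high_min are assumed decision thresholds: voltages below
    v_low_max are read as 0, voltages above v_high_min as 1, the rest is uncertain.
    """
    low = mass_between(bin_edges, probs, bin_edges[0], v_low_max)
    uncertain = mass_between(bin_edges, probs, v_low_max, v_high_min)
    high = mass_between(bin_edges, probs, v_high_min, bin_edges[-1])
    if expected_bit == 1:
        return {"correct": high, "uncertain": uncertain, "incorrect": low}
    return {"correct": low, "uncertain": uncertain, "incorrect": high}


# Made-up histogram of the latch input voltage for an intended logic 1 (VDD = 1 V).
edges = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
probs = [0.01, 0.02, 0.05, 0.12, 0.80]
result = classify_latch_input(edges, probs, v_low_max=0.3, v_high_min=0.7, expected_bit=1)
print(result)
```

In this sketch the incorrect mass is a per-sample estimate of the probability that a wrong value is stored. How the uncertain mass should be counted is deliberately left open.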

Transfer characteristics
The basic principle used in the statistical modeling of logic gates is to project the PDF of the input voltages to a PDF of the output voltage of the gate. To achieve this, each input voltage PDF is partitioned into discrete intervals $V_a < V_{in} < V_b$ with an associated probability $P(V_a < V_{in} < V_b)$. Considering the output voltages $V_{out,a}$ and $V_{out,b}$ associated with the interval edges $V_a$ and $V_b$, an appropriate output voltage interval is identified to which the probability is mapped. For an inverter, as an example, the mapping is shown in Fig. 3 and represented by

$$P(V_{out,b} < V_{out} < V_{out,a}) = P(V_a < V_{in} < V_b) \qquad (1)$$

where $P(V_{out,b} < V_{out} < V_{out,a})$ is the probability of the output voltage being in the range between $V_{out,a}$ and $V_{out,b}$; the edges appear in reversed order because the transfer characteristic of an inverter is falling. In Fig. 3 the input voltage PDF shown in the lower right is projected to an output PDF on the left using the transfer characteristics of the inverter. Thus, the output voltage PDF $p(V_{out})$ can be generated by repeating this mapping for all input voltage intervals. In general, for gates with multiple input ports, the combined probabilities of the input intervals have to be used, as for two input ports in

$$P(V_{out,b} < V_{out} < V_{out,a}) = P(V_{a,0} < V_{in,0} < V_{b,0}) \cdot P(V_{a,1} < V_{in,1} < V_{b,1}) \qquad (2)$$

Equation (1) is here extended to include the input voltages $V_{in,0}$ and $V_{in,1}$, which are treated as statistically independent. With both voltages being in defined intervals, the edges of these input intervals are used to calculate the edges of the appropriate interval of the output voltage.
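The interval mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the logistic-shaped curve `inverter_vtc`, the crude `nand_vtc` stand-in, the uniform spreading of each interval's probability over the output bins it covers, the independence of the two inputs and all numbers are hypothetical and not taken from this work.

```python
import math

VDD = 1.0  # assumed supply voltage in volts


def inverter_vtc(v_in, v_switch=0.5, gain=20.0):
    """Hypothetical smooth, falling inverter transfer curve (illustration only)."""
    return VDD / (1.0 + math.exp(gain * (v_in - v_switch)))


def nand_vtc(v0, v1):
    """Crude stand-in for a NAND gate: the lower input dominates the output."""
    return inverter_vtc(min(v0, v1))


def spread_mass(out_edges, out_probs, lo, hi, p):
    """Add probability p to the output bins overlapping [lo, hi]."""
    n = len(out_probs)
    if hi - lo < 1e-12:  # degenerate interval: put everything into one bin
        for k in range(n):
            if lo < out_edges[k + 1] or k == n - 1:
                out_probs[k] += p
                return
    for k in range(n):
        overlap = min(out_edges[k + 1], hi) - max(out_edges[k], lo)
        if overlap > 0:
            out_probs[k] += p * overlap / (hi - lo)


def project_1in(in_edges, in_probs, transfer, out_edges):
    """Single-input mapping: each input interval hands its probability to the output."""
    out_probs = [0.0] * (len(out_edges) - 1)
    for v_a, v_b, p in zip(in_edges[:-1], in_edges[1:], in_probs):
        lo, hi = sorted((transfer(v_a), transfer(v_b)))
        spread_mass(out_edges, out_probs, lo, hi, p)
    return out_probs


def project_2in(edges0, probs0, edges1, probs1, transfer, out_edges):
    """Two-input mapping with independent inputs: interval probabilities multiply."""
    out_probs = [0.0] * (len(out_edges) - 1)
    for a0, b0, p0 in zip(edges0[:-1], edges0[1:], probs0):
        for a1, b1, p1 in zip(edges1[:-1], edges1[1:], probs1):
            # For a transfer function monotonic in each input, the output
            # extremes are reached at the corners of the input rectangle.
            corners = [transfer(x, y) for x in (a0, b0) for y in (a1, b1)]
            spread_mass(out_edges, out_probs, min(corners), max(corners), p0 * p1)
    return out_probs


edges = [i / 10 for i in range(11)]
noisy_low = [0.6, 0.3, 0.08, 0.02, 0, 0, 0, 0, 0, 0]  # made-up PDF of a noisy logic 0
noisy_high = noisy_low[::-1]                           # made-up PDF of a noisy logic 1

inv_out = project_1in(edges, noisy_low, inverter_vtc, edges)
nand_out = project_2in(edges, noisy_high, edges, noisy_high, nand_vtc, edges)
print([round(p, 4) for p in inv_out])
print([round(p, 4) for p in nand_out])
```

With these made-up inputs almost all output probability of the inverter ends up in the highest voltage bin and that of the NAND stand-in in the lowest one, while the total probability stays at 1 in both cases.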
This basic modeling concept does not yet consider adverse effects like supply noise or variability. To account for these, a set of transfer curves is employed instead of a single nominal one. The individual characteristics of this set are generated by Monte Carlo simulation of the logic gate under the influence of the effects to be modeled. The resulting set of transfer curves for an inverter under threshold voltage variation is shown in Fig. 4. Each of the characteristics is associated with a combination of threshold voltages for the n- and p-channel transistors.
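How such a set of transfer curves might be produced can be sketched as follows. The sketch replaces the circuit simulation with a simple behavioral stand-in: threshold voltages are drawn from Gaussian distributions, the switching point follows the textbook long-channel expression for an inverter with matched n- and p-channel drive strength, and the curve shape is a hypothetical logistic function. All names, distributions and parameter values are assumptions for illustration, not the setup used in this work.

```python
import math
import random

VDD = 1.0  # assumed supply voltage in volts


def switching_point(v_tn, v_tp_abs, vdd=VDD):
    """Switching threshold of a long-channel inverter with matched n/p drive strength."""
    return (vdd + v_tn - v_tp_abs) / 2.0


def make_vtc(v_switch, gain=20.0, vdd=VDD):
    """Return a hypothetical smooth transfer curve centered at v_switch."""
    def vtc(v_in):
        return vdd / (1.0 + math.exp(gain * (v_in - v_switch)))
    return vtc


def sample_curve_set(n_curves, vt_nominal=0.3, vt_sigma=0.03, seed=0):
    """Draw threshold-voltage pairs and build one transfer curve per pair.

    Each entry is (probability, v_tn, v_tp_abs, curve); with plain Monte Carlo
    sampling every drawn combination carries the same weight 1 / n_curves.
    """
    rng = random.Random(seed)
    curves = []
    for _ in range(n_curves):
        v_tn = rng.gauss(vt_nominal, vt_sigma)
        v_tp_abs = rng.gauss(vt_nominal, vt_sigma)
        curve = make_vtc(switching_point(v_tn, v_tp_abs))
        curves.append((1.0 / n_curves, v_tn, v_tp_abs, curve))
    return curves


curve_set = sample_curve_set(n_curves=50)
v_switch_values = [switching_point(v_tn, v_tp) for _, v_tn, v_tp, _ in curve_set]
print(min(v_switch_values), max(v_switch_values))
```

Each tuple pairs one transfer curve with the threshold voltage combination that produced it and with the probability of that combination, which is the information needed to weight the per-curve results later on.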
The probability $P(C_i)$ of the occurrence of a given combination of effects $C_i$ is known. Therefore the overall PDF of the output voltage can be derived by estimating the output PDF $p(V_{out} \mid C_i)$ for each $C_i$, weighting these with the corresponding probabilities $P(C_i)$ and accumulating the results as in

$$p(V_{out}) = \sum_i P(C_i)\, p(V_{out} \mid C_i) \qquad (3)$$

To model a complete pipeline stage, the results are taken as input PDFs for the subsequent logic gates and the estimation is repeated until the latch is reached.
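The weighted accumulation over all combinations $C_i$ amounts to a mixture of the conditional output distributions. A small numerical sketch follows, assuming that every conditional PDF is stored as probability mass over the same output voltage bins; the function name and the numbers are made up for illustration.

```python
def accumulate_output_pdf(weights, conditional_pdfs):
    """Weighted sum of conditional output distributions, one per combination C_i."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("the probabilities P(C_i) must sum to 1")
    n_bins = len(conditional_pdfs[0])
    total = [0.0] * n_bins
    for w, pdf in zip(weights, conditional_pdfs):
        if len(pdf) != n_bins:
            raise ValueError("all conditional PDFs must share the same binning")
        for k, p in enumerate(pdf):
            total[k] += w * p
    return total


# Made-up example: three threshold-voltage combinations, three output voltage bins.
p_c = [0.25, 0.5, 0.25]
p_out_given_c = [
    [0.0, 0.1, 0.9],
    [0.0, 0.2, 0.8],
    [0.1, 0.3, 0.6],
]
p_out = accumulate_output_pdf(p_c, p_out_given_c)
print([round(p, 3) for p in p_out])  # -> [0.025, 0.2, 0.775]
```

The resulting distribution could then serve as the input PDF of the next gate in the pipeline stage.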

Conclusion
As predicted by the ITRS roadmap, reliability of VLSI circuits is becoming a serious problem for future VLSI technologies, especially for low-power applications. Reduction of supply voltages, noise on the supply net, electromagnetic coupling and variability severely increase the susceptibility of circuits to transient faults. The fault-tolerant architectures and circuit approaches currently developed suffer from a significant overhead in area and power consumption. It thus seems attractive to maintain efficiency by tolerating bit errors and making use of error handling mechanisms present at higher system levels to cope with them. Knowledge of the BER at the physical level is essential to this approach. A statistical modeling concept is proposed that allows estimation of the stored BER with moderate effort compared to Monte Carlo simulations.
Beyond the basic concept presented in this work, research on this subject will have to cover several aspects. Regarding the proposed statistical reliability estimation, the interdependences of the input signals of the logic gates in a pipeline stage need to be considered, as well as a statistical description of disturbances due to electromagnetic coupling. Furthermore, to devise a framework for estimating the impact of physical faults on timing and voltage levels, the actual merging of SSTA and the proposed statistical voltage level modeling methods needs to be established.

Figure 2. Modeling of logic gates

Figure 3. Statistical modeling of an inverter without noise
Figure 4. Set of transfer curves of an inverter subjected to threshold voltage variation
