body 12-12-13

Published on December 2016 | Categories: Documents | Downloads: 43 | Comments: 0 | Views: 244

Design of Low Power ALU using Area Efficient Carry Select Adder

CHAPTER 1 OVERVIEW OF THE PROJECT
1.1 Introduction
Designing low power VLSI circuits with less area and high speed has become a main concern for digital designers. Low power VLSI systems are in high demand because of the fast growing technology in mobile communications and computation. Battery technology does not advance at the same rate as microelectronics technology, so there is a limited amount of power available for mobile systems. Designers are therefore faced with more constraints: high speed, high throughput, small silicon area and, at the same time, low power consumption. Building low power, high performance adder cells is thus of great interest [1]-[5].

In the past few decades, the electronics industry has experienced an unprecedented spurt in growth, thanks to the use of integrated circuits in computing, telecommunications and consumer electronics. We have come a long way from the single transistor era in 1958 to the present day ULSI (Ultra Large Scale Integration) systems with more than 50 million transistors in a single chip [6]. As the performance of processors has increased, the demand for high speed arithmetic blocks has also increased. With clock frequencies approaching 1 GHz, arithmetic blocks must keep pace with the continued demand for more computational power.

The purpose of this thesis is to present methods of implementing an area and power efficient carry select adder. To reduce the power and area requirements of computationally complex blocks, transistors are shrunk into the deep sub-micron region [7], which is predominantly handled by process engineering. Several adder designs have been proposed to reduce the power consumption. Logic minimization not only results in better system throughput but also results in low power consumption designs.

Department of ECE, MRITS

For low power results it is always advisable to use CMOS technology, in which the power dissipation is a complex function of the gate delays, clock frequency, process parameters, circuit topology and structure, and the input vectors applied. Once the processing and structural parameters have been fixed, the power dissipation is dominated by the switching activity (toggle count) of the circuit. The dynamic power is given by

$$P = \frac{1}{2} \cdot C_{load} \cdot \frac{V_{dd}^{2}}{T_{cycle}} \cdot E(\text{switching})$$

where C_load is the load capacitance of the gate, T_cycle is the clock cycle time, E(switching) is the expected number of signal transitions per cycle and V_dd is the supply voltage [8].
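As a rough illustration of the dynamic power expression above, the following sketch evaluates it numerically. The function name and the numeric values (10 fF load, 1.2 V supply, 1 ns cycle, 0.5 expected transitions per cycle) are assumptions chosen only for illustration; they are not measurements from this work.

```python
def dynamic_power(c_load, v_dd, t_cycle, e_switching):
    """Dynamic power: P = 1/2 * C_load * (V_dd^2 / T_cycle) * E(switching)."""
    return 0.5 * c_load * (v_dd ** 2 / t_cycle) * e_switching


# Hypothetical operating point (illustrative values only).
p = dynamic_power(c_load=10e-15, v_dd=1.2, t_cycle=1e-9, e_switching=0.5)

# Halving the supply voltage quarters the dynamic power (quadratic in V_dd).
p_half_vdd = dynamic_power(c_load=10e-15, v_dd=0.6, t_cycle=1e-9, e_switching=0.5)
print(p, p_half_vdd)
```

The quadratic dependence on the supply voltage is why lowering V_dd is usually the first low power measure considered.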

1.2 Objective
The objective is to design a high speed Arithmetic Logic Unit (ALU) by using an efficient carry select adder. The adder is the most important block in the ALU, and the speed of the ALU is limited by the adder because the carry has to pass through a large number of bits. In digital adders, the Ripple Carry Adder (RCA) is modified into the Carry Select Adder (CSLA) to speed up the operation. To achieve more speed, the CSLA is replaced by the SQRT CSLA. The CSLA is used in many computational systems to alleviate the problem of carry propagation delay by independently generating multiple carries and then selecting a carry to generate the sum [9]-[10]. However, the CSLA is not area efficient because it uses multiple pairs of Ripple Carry Adders (RCA) to generate partial sums and carries for carry input Cin=0 and Cin=1; the final sum and carry are then selected by multiplexers (mux) [11]-[15].

1.2.1 Existing SQRT Carry Select Adder
In general the complete SQRT CSLA is divided into different blocks. The block size and the number of blocks depend upon the size of the SQRT CSLA according to the SQRT technique. From the second block onwards, each block contains three levels: the first level is a ripple carry adder with input carry zero, the second level is a ripple carry adder with input carry one, and the third level is a multiplexer that selects one of the ripple carry adder outputs according to the previous block carry. The disadvantage of the SQRT CSLA is its larger area requirement, as it uses two levels of RCAs.


For achieving better area efficiency [13]-[15], a Binary to Excess-1 Converter (BEC) is used in place of the RCA with Cin=1 in the regular CSLA. To replace an n-bit RCA, an (n+1)-bit BEC is required. Though the BEC technique reduces area and power [16], it does not do so by a considerable amount, and the design is not suitable for sub-threshold level modifications. The drawback of this logic structure is that it does not reduce the area and power to a satisfactory level. There is also still scope to reduce the delay. In order to reduce the power and area, a new logic structure to replace the BEC is proposed.

1.2.2 Proposed SQRT Carry Select Adder
The 16-bit SQRT CSLA using a BEC in its second level requires 792 transistors. There is scope to reduce the number of transistors, along with the area and power dissipation, by using the proposed logic. With the proposed logic, the implementation of a 16-bit SQRT CSLA requires 736 transistors. The proposed logic implementation for the second level RCA is Special Hardware using Multiplexers (SHM). The inputs are applied to the first level RCA, the output of the RCA is applied to the second level SHM, and both are applied to the third level multiplexer. The third level multiplexer selects either the RCA output or the SHM output according to the previous carry. Using the proposed logic, an 8-bit Arithmetic Logic Unit (ALU) is designed which performs arithmetic operations such as addition, subtraction, increment and decrement, and logical operations such as AND, OR, XOR and XNOR.
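The role of the BEC can be sketched behaviourally as follows. This is a minimal model, assuming the usual BEC logic in which each output bit is the input bit XORed with the AND of all lower input bits; the helper names `to_bits` and `bec` are hypothetical and do not describe the transistor-level design of this work.

```python
def to_bits(value, width):
    """Return `width` bits of `value`, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def bec(bits):
    """Binary to Excess-1 conversion: bits of (input + 1), LSB first.

    Output bit i = input bit i XOR (AND of all lower input bits),
    so the LSB is simply inverted. A carry out of the top bit is dropped.
    """
    out = []
    lower_all_ones = 1
    for b in bits:
        out.append(b ^ lower_all_ones)
        lower_all_ones &= b
    return out


# A 4-bit RCA with Cin = 0 yields a 5-bit word (4 sum bits plus carry-out),
# so a 5-bit BEC is needed to produce the Cin = 1 result from it.
a, b = 0b1011, 0b0110
with_cin0 = to_bits(a + b, 5)
with_cin1 = bec(with_cin0)
```

The (n+1)-bit width matches the statement above that an n-bit RCA is replaced by an (n+1)-bit BEC.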

1.3 Tools used
SOFTWARE:
Logic Editor: DSCH 2.6c
Layout Editor: Microwind 2.6a


The performance of the proposed design is analyzed. The simulations are performed in 120 nm (0.12 um) technology using the simulation tool Microwind2, with a power supply of 1.2 V and a nominal temperature of 27°C, to extract the critical path delay and power consumption.

1.4 Thesis outline
The next chapter presents the literature survey, covering different types of adders, different low power design techniques used in the design of a low power ALU, and different logic styles. The existing design, an 8-bit ALU using ripple carry adders, is described in Chapter 3 along with the implementation of the SQRT CSLA using the BEC technique. Chapter 4 describes the implementation of the proposed SQRT CSLA and the proposed ALU using the efficient carry select adder. Comparative analysis and results are shown in Chapter 5. Conclusion and future scope are discussed in Chapter 6.


CHAPTER 2 LITERATURE SURVEY
2.1 Introduction
In nearly all digital IC designs today, the addition operation is one of the most essential and frequent operations. Instruction sets for DSPs and general purpose processors include at least one type of addition. Other instructions such as subtraction and multiplication employ addition in their operations, and their underlying hardware is similar if not identical to addition hardware. Often, an adder or multiple adders will be in the critical path of the design, hence the performance of a design will often be limited by the performance of its adders. When looking at other attributes of a chip, such as area or power, the designer will find that the hardware for addition is a large contributor in these areas. It is therefore beneficial to choose the correct adder to implement in a design because of the many factors it affects in the overall chip. In this chapter we begin with the basic building blocks used for addition, then go through different algorithms and name their advantages and disadvantages.

2.2 Basic Adder Blocks

2.2.1 Half Adder
The half adder is an example of a simple, functional digital circuit built from two logic gates. The half adder adds two one-bit binary numbers (A, B). The output is the sum of the two bits (S) and the carry (C). Note how the same two inputs are directed to two different gates. The inputs to the XOR gate are also the inputs to the AND gate. The input "wires" to the XOR gate are tied to the input wires of the AND gate; thus, when voltage is applied to the A input of the XOR gate, the A input to the AND gate receives the same voltage.

$$S = A \oplus B \tag{2.1}$$
$$C = A \cdot B \tag{2.2}$$
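The half adder described above can be checked with a short behavioural sketch. The function name is an assumption for illustration; the logic is just the XOR gate for the sum and the AND gate for the carry.

```python
def half_adder(a, b):
    """Return (sum, carry) for two one-bit inputs."""
    s = a ^ b      # XOR gate produces the sum bit
    c = a & b      # AND gate produces the carry bit
    return s, c


# Enumerate the full truth table: (a, b) -> (sum, carry)
truth_table = {(a, b): half_adder(a, b) for a in (0, 1) for b in (0, 1)}
for inputs, outputs in truth_table.items():
    print(inputs, outputs)
```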


Fig. 2.1 Half adder

2.2.2 Full Adder
In electronics, an adder is a digital circuit that performs addition of numbers. Full adders are fundamental units in various circuits, especially in circuits used for performing arithmetic operations such as compressors, comparators, parity checkers, arithmetic logic units and so on. The full adder takes into account a carry input such that multiple adders can be used to add larger numbers. To remove ambiguity between the input and output carry lines, the carry in is labeled Cin while the carry out is labeled Cout. The full-adder circuit adds three one-bit binary numbers (Cin, A, B) and outputs two one-bit binary numbers, a sum (S) and a carry (Cout). The full adder is usually a component in a cascade of adders, which add 8-, 16-, 32-bit, etc. binary numbers. The carry input for the full-adder circuit is from the carry output of the circuit "above" itself in the cascade. The carry output from the full adder is fed to another full adder "below" itself in the cascade. Hence, a full adder is a digital circuit that performs an addition operation on three binary digits. The full adder produces a sum and a carry value, which are both binary digits. It can be combined with other full adders or work on its own.

Fig. 2.2 Schematic symbol of 1-bit full-adder cell (inputs A, B, CI; outputs S, CO)


The final OR gate before the carry-out output may be replaced by an XOR gate without altering the resulting logic. This is because the only discrepancy between OR and XOR gates occurs when both inputs are 1; for the adder shown here, one can check that this is never possible. Using only two types of gates is convenient if one desires to implement the adder directly using common IC chips. A full adder can be constructed from two half adders by connecting A and B to the inputs of one half adder, connecting the sum from that to an input of the second half adder, connecting Ci to the other input, and ORing the two carry outputs. Equivalently, S could be made the three-bit XOR of A, B and Ci, and Co could be made the three-bit majority function of A, B and Ci. The output of the full adder is the two-bit arithmetic sum of three one-bit numbers.
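The construction from two half adders, and the claim that the final OR gate never sees both inputs at 1, can be checked exhaustively. This is a behavioural sketch with hypothetical function names, not the gate-level circuit of the figure.

```python
from itertools import product


def half_adder(a, b):
    """Return (sum, carry) for two one-bit inputs."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Full adder built from two half adders plus an OR of the two carries."""
    s1, c1 = half_adder(a, b)        # first half adder: A and B
    s, c2 = half_adder(s1, cin)      # second half adder: partial sum and Ci
    return s, c1 | c2, (c1, c2)      # also expose the two carries for checking


results = {bits: full_adder(*bits) for bits in product((0, 1), repeat=3)}

# The two half-adder carries are never 1 at the same time, which is why
# the OR gate could be replaced by an XOR gate without changing the logic.
both_carries_high = any(c1 & c2 for _, _, (c1, c2) in results.values())
print(both_carries_high)
```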

Figure 2.3 Circuit diagram of 1-bit full-adder cell

$$S = A \oplus B \oplus C_i \tag{2.3}$$
$$C_o = A \cdot B + C_i \cdot (A \oplus B) \tag{2.4}$$

2.2.3 Partial Full Adder


The Partial Full Adder (PFA) is a structure that implements intermediate signals that can be used in the calculation of the carry bit. It is an extension of the FA which includes the signals generate (g), kill (k) and propagate (p). When g=1, the carryout will be 1 (generated) regardless of carry-in. When k=1, the carryout will be 0 (killed) regardless of carry-in. When p=1, the carryout will equal carry-in (carry-in will be propagated). Table 2.1 reflects these three additional signals, with a comment on the carryout bit in an additional column. Equations 2.5 – 2.7 are the Boolean equations for generate, kill and propagate, respectively. It should be noted that for the propagate signal the XOR function can also be used, since in the case of a, b=1 the generate signal will assert that carryout is 1. The Boolean equations for the sum and carryout can now be written as functions of g, p or k, as shown by Equations 2.8 and 2.9; the sum in Equation 2.8 requires the XOR form of the propagate signal. Figure 2.4 shows a circuit for creating the Generate, Propagate and Sum signals. It is a partial full adder because it does not calculate the carryout signal directly; rather, it creates the signals needed to calculate the carryout signal.

$$\text{Generate}_i\ (g_i) = a_i \cdot b_i \tag{2.5}$$
$$\text{Kill}_i\ (k_i) = \overline{a_i} \cdot \overline{b_i} \tag{2.6}$$
$$\text{Propagate}_i\ (p_i) = a_i + b_i \tag{2.7}$$
$$\text{Sum}_i = (a_i \oplus b_i) \oplus \text{carry-in}_i \tag{2.8}$$
$$\text{Carry-out}_{i+1} = a_i \cdot b_i + b_i \cdot \text{carry-in}_i + a_i \cdot \text{carry-in}_i = g_i + p_i \cdot \text{carry-in}_i \tag{2.9}$$


Figure 2.4 Generation of GENERATE, PROPAGATE and SUM

Table 2.1 Truth table of partial full adder

Inputs              Outputs
Carry-in  a  b      Carry-out  Sum  g  k  p   Carry-status
0         0  0      0          0    0  1  0   delete
0         0  1      0          1    0  0  1   propagate
0         1  0      0          1    0  0  1   propagate
0         1  1      1          0    1  0  1   generate/propagate
1         0  0      0          1    0  1  0   delete
1         0  1      1          0    0  0  1   propagate
1         1  0      1          0    0  0  1   propagate
1         1  1      1          1    1  0  1   generate/propagate
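The partial full adder signals can be tabulated with a small behavioural sketch. It assumes the OR form of the propagate signal for the carry and the XOR of the operands for the sum; the function name `pfa` is hypothetical.

```python
from itertools import product


def pfa(a, b, cin):
    """Return (g, k, p, sum, carry_out) for one bit position."""
    g = a & b                  # generate: carry-out is 1 regardless of carry-in
    k = (1 - a) & (1 - b)      # kill: carry-out is 0 regardless of carry-in
    p = a | b                  # propagate (OR form)
    s = (a ^ b) ^ cin          # sum uses the XOR of the operands
    cout = g | (p & cin)       # carry-out from g, p and carry-in
    return g, k, p, s, cout


# One row per input combination: (carry-in, a, b, g, k, p, sum, carry-out)
rows = [(cin, a, b) + pfa(a, b, cin) for cin, a, b in product((0, 1), repeat=3)]
for row in rows:
    print(row)
```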

2.3 Adder Algorithms
2.3.1 Ripple Carry Adder

The Ripple Carry Adder (RCA) is one of the simplest adders to implement. This adder takes in two N-bit inputs (where N is a positive integer) and produces (N + 1) output bits (an N-bit sum and a 1-bit carryout). The RCA is built from N full adders cascaded together, with the carryout bit of one FA tied to the carry-in bit of the next FA. Figure 2.5 shows the schematic for an N-bit RCA. The input operands are labeled 'a' and 'b', the carryout of each FA is labeled Cout (which is equivalent to the carry-in (c-in) of the subsequent FA), and the sum bits are labeled sum. Each sum bit requires both input operands and Cin before it can be calculated. To estimate the propagation delay of this adder, we should look at the worst case delay over every possible combination of inputs. This is also known as the critical path. The most significant sum bit can only be calculated when the carryout of the previous FA is known. In the worst case (when all the carry-outs are 1), this carry bit needs to ripple across the structure from the least significant position to the most significant position. Figure 2.6 has a darkened line indicating the critical path. Hence, the time for this implementation of the adder is expressed in Equation 2.10, where tRCAcarry is the delay for the carryout of a FA and tRCAsum is the delay for the sum of a FA.

$$\text{Propagation Delay}\ (t_{RCAprop}) = (N-1) \cdot t_{RCAcarry} + t_{RCAsum} \tag{2.10}$$

From Equation 2.10, we can see that the delay is proportional to the length of the adder. An example of a worst case propagation delay input pattern for a 4-bit ripple carry adder is where the input operands change from 1111 and 0000 to 1111 and 0001, resulting in a sum changing from 01111 to 10000. From a VLSI design perspective, this is the easiest adder to implement. One just needs to design and lay out one FA cell, and then array N of these cells to create an N-bit RCA. The performance of the one FA cell will largely determine the speed of the whole RCA. From the critical path in Equation 2.10, minimizing the carryout delay (tRCAcarry) of the FA will minimize tRCAprop. There are various implementations of the FA cell that minimize the carryout delay.
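The ripple behaviour and the delay estimate of Equation 2.10 can be sketched as follows. Bit lists are least significant bit first; the function names and the unit delays used in the example are assumptions for illustration only.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first) by rippling the carry."""
    sums = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry


def rca_delay(n, t_carry, t_sum):
    """Worst-case delay of an N-bit RCA: (N - 1) * t_carry + t_sum."""
    return (n - 1) * t_carry + t_sum


# Worst-case pattern from the text: 1111 + 0001 = 10000 (bit lists are LSB first).
sums, cout = ripple_carry_add([1, 1, 1, 1], [1, 0, 0, 0])
delay_4bit = rca_delay(4, t_carry=1, t_sum=1)    # unit delays, illustrative
delay_16bit = rca_delay(16, t_carry=1, t_sum=1)
```

With unit delays the 16-bit adder is four times slower than the 4-bit adder, which is the linear dependence the text describes.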

Figure 2.5 Schematic for an N-bit Ripple Carry Adder


Figure 2.6 Critical paths for an N-bit Ripple Carry Adder

2.3.2 Carry Skip Adder
From examination of the RCA, the limiting factor for speed in that adder is the propagation of the Cout bit. The Carry Skip Adder (CSKA), also known as the Carry Bypass Adder, addresses this issue by looking at groups of bits and determining whether each group has a carryout or not. This is accomplished by creating a group propagate signal (P_CSKAgroup) to determine whether the group carry-in (carry-in_CSKAgroup) will propagate across the group to the carryout (carry-out_CSKAgroup). To explore the operation of the whole CSKA, take an N-bit adder and divide it into N/M groups, where M is the number of bits per group. Each group contains a 2-to-1 multiplexer, logic to calculate M sum bits, and logic to calculate P_CSKAgroup. The select line for the mux is simply the P_CSKAgroup signal, and it chooses between carry-in_CSKAgroup and cout4.

To aid the explanation, we refer the reader to Figure 2.7, which shows the hardware for a group of 4 bits (M=4) in the CSKA. There are four full adders cascaded together and each FA creates a carryout (cout), a propagate (p) signal, and a sum (sum, not shown). The propagate signal from each FA comes at no extra hardware cost since it is calculated in the sum logic (the hardware is identical to the sum hardware for the PFA shown in Figure 2.4). For carry-out_CSKAgroup to equal carry-in_CSKAgroup, all of the individual propagates must be asserted (Equations 2.11 and 2.12). If this is true, then carry-in_CSKAgroup "skips" past the group of full adders and equals carry-out_CSKAgroup.

For the case where P_CSKAgroup is 0, at least one of the propagate signals is 0. This implies that either a delete and/or a generate occurred in the group. A delete signal simply means that the carryout for the group is 0 regardless of the carry-in, and a generate signal means that the carryout is 1 regardless of the carry-in. This is advantageous because it implies that the carry-out for the group is not dependent on the carry-in. No hardware is needed to implement these two signals because the group carryout signal will reflect one of the three cases (a delete, a generate or a group propagate occurred). The additional hardware to realize the group carryout in Figure 2.7 is a 4-input AND gate and a 2-to-1 multiplexer (mux). In general, an M-input AND gate and a 2-to-1 mux are required for a group of M bits, in addition to the logic to calculate the sum bits.

$$P_{CSKAgroup} = p_0 \cdot p_1 \cdot p_2 \cdot p_3 \tag{2.11}$$
$$\text{Carry-out}_{CSKAgroup} = \text{Carry-in}_{CSKAgroup} \cdot P_{CSKAgroup} \tag{2.12}$$
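The group structure described above can be modelled behaviourally as follows. This sketch assumes the XOR form of the per-bit propagate signal, so that a group containing a generate is never treated as a pure propagate; the function names are hypothetical, and the model captures the logic only, not the timing benefit of the skip path.

```python
def to_bits(value, width):
    """Return `width` bits of `value`, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def cska_group(a_bits, b_bits, cin):
    """One carry-skip group: returns (sum_bits, group_carry_out, group_propagate)."""
    p_group = 1
    carry = cin
    sums = []
    for a, b in zip(a_bits, b_bits):
        p = a ^ b                         # per-bit propagate (XOR form, assumed)
        p_group &= p                      # M-input AND of the propagates
        sums.append(p ^ carry)            # sum bit
        carry = (a & b) | (p & carry)     # rippled carry inside the group
    # 2-to-1 mux: skip the group carry-in when every bit propagates,
    # otherwise take the carry-out of the most significant full adder.
    cout = cin if p_group else carry
    return sums, cout, p_group


def carry_skip_add(a, b, n=16, m=4):
    """Add two n-bit integers using n/m carry-skip groups of m bits."""
    a_bits, b_bits = to_bits(a, n), to_bits(b, n)
    carry, total = 0, 0
    for lo in range(0, n, m):
        sums, carry, _ = cska_group(a_bits[lo:lo + m], b_bits[lo:lo + m], carry)
        for i, s in enumerate(sums):
            total |= s << (lo + i)
    return total | (carry << n)


result = carry_skip_add(0b1111111111111000, 0b0000000000001000)
```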

In examining the critical path for the CSKA, we are primarily concerned with whether the carry-in can be propagated ("skipped") across a group or not. Assuming all input bits come into the adder at the same time, each group can calculate the group propagate signal (mux select line) simultaneously. Every mux then knows which signal to pass as the carryout of the group. There are two cases to consider after the mux select line has been determined. In the first case, carry-in_CSKAgroup will propagate to the carryout. This means P_CSKAgroup=1 and the carryout is dependent on the carry-in. In the second case, the carryout signal of the most significant adder will become the group carryout. This means P_CSKAgroup=0 and the carryout is independent of the carry-in. If we isolate a particular group (as in Figure 2.7), the second case (signal cout4) always takes longer because the carryout signal must be calculated through logic, whereas the first case (carry-in_CSKAgroup) requires only a wire to propagate the signal.

Looking at the whole architecture, however, this second case is part of the critical path for only the first CSKA group. Since the second case is not dependent on the group carry-in, all the groups in the CSKA can compute the carryout in parallel. If a group needs its carry-in (P_CSKAgroup=1), then it must wait until it arrives after being calculated by a previous group. In the worst case, a carryout must be calculated in the first group, and every group afterwards needs to propagate this carryout. When the final group receives this propagated signal, it can calculate its sum bits. Figure 2.8 shows a 16-bit CSKA with 4-bit groups and Figure 2.9 shows a darkened line indicating the critical path of the signals in the 16-bit CSKA. If we assume a 16-bit CSKA with 4-bit groups, with each group containing a 4-bit RCA for the sum logic, then the worst case propagation delay through this adder is expressed in Equation 2.13.

In this equation, tRCAcarry and tRCAsum are the delays to calculate the carryout and sum signals of an RCA, respectively. Each group has 4 bits, so the delay through the first group has 4 RCA carryout delays. This carryout of the first group potentially propagates through 3 muxes, where one mux delay is expressed as tmuxdelay. Finally, when the carryout signal reaches the final group, the sum for this group can be calculated. This is represented by the final two components of Equation 2.13.

Figure 2.7 One group in a Carry Skip Adder, in this case M=4

Figure 2.8 A 16-bit Carry Skip Adder, N=16, M=4

Figure 2.9 Critical path through 16-bit CSKA


$$t_{CSKA16} = 4 \times t_{RCAcarry} + 3 \times t_{muxdelay} + 3 \times t_{RCAcarry} + t_{RCAsum} \tag{2.13}$$

For Equation 2.13, there are some assumptions about the delay through the circuit. First, we assume that in the first CSKA group the group propagate signal is calculated before the carryout of the most significant adder. Thus, the mux for this first group is waiting for the carryout. For the final CSKA group, we assume that it takes longer for sum15 to be calculated than for sum16. Once the carry-in for this last group is known, the delay for sum16 is the delay of the mux; for sum15 it is a delay of 3 × tRCAcarry + tRCAsum (3 ripples through the adder before the last sum bit can be calculated). For an N-bit CSKA, the critical path is expressed in Equation 2.14. M represents the number of bits in each group. There are N/M groups in the adder, and every mux except the last one is in the critical path. As in Equation 2.13, Equation 2.14 assumes that each group contains a ripple carry adder.

$$t_{CSKAN} = M \times t_{RCAcarry} + \left(\frac{N}{M} - 1\right) \times t_{muxdelay} + (M-1) \times t_{RCAcarry} + t_{RCAsum} \tag{2.14}$$
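The general delay expression of Equation 2.14 can be evaluated with a short sketch and compared against the ripple carry delay of Equation 2.10. The function names and the unit delays are assumptions for illustration; real cell delays would come from simulation.

```python
def cska_delay(n, m, t_carry, t_sum, t_mux):
    """Worst-case delay of an N-bit carry skip adder with M-bit RCA groups."""
    first_group = m * t_carry                # ripple through the first group
    skip_path = (n // m - 1) * t_mux         # every mux except the last one
    last_group = (m - 1) * t_carry + t_sum   # ripple to the final sum bit
    return first_group + skip_path + last_group


def rca_delay(n, t_carry, t_sum):
    """Worst-case delay of an N-bit ripple carry adder (Equation 2.10)."""
    return (n - 1) * t_carry + t_sum


# With unit delays the 16-bit, 4-bit-group case reduces to 4 + 3 + 3 + 1.
t_cska16 = cska_delay(16, 4, t_carry=1, t_sum=1, t_mux=1)
t_rca16 = rca_delay(16, t_carry=1, t_sum=1)
print(t_cska16, t_rca16)
```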

From a VLSI design perspective, this adder shows improved speed over an RCA without much area increase. The additional hardware comes from the 2-to-1 mux and the group propagate logic in each group, which is about 15% more area. One drawback of this structure is that its delay is still linearly dependent on the width of the adder; therefore, for large adders where speed is important, the delay may be unacceptable. Also, there is a long wire between the groups that carry-out_CSKAgroup needs to travel on. This path begins at the carryout of the first CSKA group and ends at the carry-in to the final CSKA group. This signal also needs to travel through ((N/M)-1) muxes, and these will introduce long delays and signal degradation if pass gate muxes are used. If buffers are required between these groups to restore the signal, then the critical path is lengthened. An example of a worst case delay input pattern for a 16-bit CSKA with 4-bit groups is where the input operands are 1111111111111000 and 0000000000001000. This forces a carryout in the first group that skips through the middle two groups and enters the final group. This carry-in to the final group ripples through to the final sum bit (sum15). To determine the optimal speed for this adder, one needs to find the delay through a mux and the carryout delay of a FA. It is one of these two delays that will dominate the delay of the whole CSKA. For short adders (≤ 16 bits), the tcarryout of a FA will probably dominate the delay, and for long adders the long wire that skips through stages and muxes will probably dominate the delay.

2.3.3 Carry Look Ahead Adder
From the critical path equations in Sections 2.3.1 and 2.3.2, the delay is linearly dependent on N, the length of the adder. It is also shown in Equations 2.10 and 2.14 that the tcarryout signal contributes largely to the delay. An algorithm that reduces the time to calculate tcarryout and the linear dependency on N can greatly speed up the addition operation. Equation 2.9 shows that the carryout can be calculated with g, p and carry-in. The signals g and p are not dependent on carry-in, and can be calculated as soon as the two input operands arrive. Weinberger and Smith invented the Carry Look Ahead (CLA) Adder [19]. Using Equation 2.9, we can write the carryout equations for a 4-bit adder. These equations are shown in Equations 2.15 – 2.18, where ci represents the carryout of the ith position (0 ≤ i ≤ (N – 1)), and each carryout is written in terms of gi and pi so that it can be calculated with just the input operands and the initial carry-in (c0). This process of calculating ci by using only the pi, gi and c0 signals can be continued indefinitely; however, each subsequent carryout generated in this manner becomes increasingly difficult to implement because of the large number of high fan-in gates [20].

$$c_1 = g_0 + p_0 \cdot c_0 \tag{2.15}$$
$$c_2 = g_1 + p_1 \cdot c_1 = g_1 + p_1 \cdot g_0 + p_1 \cdot p_0 \cdot c_0 \tag{2.16}$$
$$c_3 = g_2 + p_2 \cdot c_2 = g_2 + p_2 \cdot g_1 + p_2 \cdot p_1 \cdot g_0 + p_2 \cdot p_1 \cdot p_0 \cdot c_0 \tag{2.17}$$
$$c_4 = g_3 + p_3 \cdot c_3 = g_3 + p_3 \cdot g_2 + p_3 \cdot p_2 \cdot g_1 + p_3 \cdot p_2 \cdot p_1 \cdot g_0 + p_3 \cdot p_2 \cdot p_1 \cdot p_0 \cdot c_0 \tag{2.18}$$
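The expanded carry equations for a 4-bit adder can be checked against ordinary integer addition. This is a behavioural sketch, assuming the OR form of the propagate signal for the carries and the XOR of the operands for the sum bits; the function names are hypothetical.

```python
def cla4_carries(a_bits, b_bits, c0):
    """Carries c1..c4 computed directly from g, p and c0 (no rippling)."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate signals
    p = [a | b for a, b in zip(a_bits, b_bits)]   # propagate signals (OR form)
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    return c1, c2, c3, c4


def cla4_add(a, b, c0=0):
    """Add two 4-bit integers using the look-ahead carries."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    c1, c2, c3, c4 = cla4_carries(a_bits, b_bits, c0)
    carries_in = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        total |= (a_bits[i] ^ b_bits[i] ^ carries_in[i]) << i
    return total | (c4 << 4)   # c4 acts as the fifth sum bit


example = cla4_add(0b1011, 0b0110)
```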

The CLA adder uses partial full adders, as described in Section 2.2.3, to calculate the generate and propagate signals needed for the carryout equations. Figure 2.10 shows the schematic for a 4-bit CLA adder. The CLA logic block implements the logic in Equations 2.15 – 2.18, and the gate schematic for this block is in Figure 2.11. For a 4-bit CLA adder the 4th carryout signal can also be considered as the 5th sum bit.

Figure 2.10 4-bit carry look-ahead adder

Although it is impractical to have a single level of carry look-ahead logic for long adders, this can be solved by adding another level of carry look-ahead logic. To achieve this, each adder block requires two additional signals: a group generate and a group propagate. The equations for these two signals, assuming adder block sizes of 4 bits, are shown in Equations 2.19 and 2.20. A group generate occurs if a carry is generated in one of the adder block positions and propagated to the block carryout, and a group propagate occurs if the carry-in to the adder block will be propagated to the carryout. Figure 2.11 shows the gate schematic of the two additional signals.

$$\text{Group Generate} = g_3 + p_3 \cdot g_2 + p_3 \cdot p_2 \cdot g_1 + p_3 \cdot p_2 \cdot p_1 \cdot g_0 \tag{2.19}$$
$$\text{Group Propagate} = p_3 \cdot p_2 \cdot p_1 \cdot p_0 \tag{2.20}$$

With multiple levels of CLA logic, carry look-ahead adders of any length can be built. To illustrate the use of another level of CLA logic, Figure 2.11 shows the schematic for a 16-bit CLA adder. There is a second level of CLA logic which takes the group generate and group propagate signals from each 4-bit adder sub cell and calculates the carryout signals for each adder block. If an adder has multiple levels of CLA logic, only the final level needs to generate the c4 signal. All other levels replace this c4 signal with the group generate and group propagate. The CLA logic for this 16-bit adder is identical to the CLA logic for the 4-bit adder in Figure 2.10; therefore the equations for the carryout signals are those in Equations 2.15 – 2.18.
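A two-level, 16-bit carry look-ahead adder built from four 4-bit blocks can be sketched behaviourally as below. The block carries are written as a recurrence over the group generate and group propagate signals for brevity, whereas hardware would expand them in the same way as the 4-bit carry equations; the function names are hypothetical.

```python
def block_signals(g, p):
    """Group generate and group propagate of one 4-bit block (LSB first)."""
    gg = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    gp = p[3] & p[2] & p[1] & p[0]
    return gg, gp


def cla16_add(a, b, c0=0):
    """Add two 16-bit integers with two levels of look-ahead logic."""
    a_bits = [(a >> i) & 1 for i in range(16)]
    b_bits = [(b >> i) & 1 for i in range(16)]
    g = [x & y for x, y in zip(a_bits, b_bits)]
    p = [x | y for x, y in zip(a_bits, b_bits)]

    # Second level: carry into each block from the group signals only.
    block_cin = [c0]
    for blk in range(4):
        gg, gp = block_signals(g[4 * blk:4 * blk + 4], p[4 * blk:4 * blk + 4])
        block_cin.append(gg | (gp & block_cin[-1]))

    # First level: carries and sum bits inside each 4-bit block.
    total = 0
    for blk in range(4):
        c = block_cin[blk]
        for i in range(4 * blk, 4 * blk + 4):
            total |= (a_bits[i] ^ b_bits[i] ^ c) << i
            c = g[i] | (p[i] & c)
    return total | (block_cin[4] << 16)   # c16 is the final carry-out


example = cla16_add(0xFFFF, 0x0001)
```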

Figure 2.11 Schematic for a 16-bit CLA adder

A third level of CLA logic and four 16-bit adder blocks can be used to build a 64-bit adder. The CLA logic would create the c16, c32 and c48 signals to be used as carry-ins to the 16-bit adder blocks, and c64 as the sum64 signal. If a design calls for an adder of length 32, a designer can simply use two 16-bit adder blocks and the first two carryout signals (c16, c32) from the third level of CLA logic. The identical hardware in the CLA logic, coupled with the fact that the adder blocks can be instantiated as sub cells, makes building long adders with this architecture simple.

Determining the critical path for a CLA adder is difficult because the gates in the carry path have different fan-in. To get a general idea, we first assume that all gate delays are the same. A 4-bit CLA adder then requires one gate delay to calculate the propagate and generate signals, two gate delays to calculate the carry signals, and one gate delay to calculate the sum signals; this equates to four gate delays. For a 16-bit CLA adder there is one gate delay to calculate the propagate and generate signals (from the PFA), two gate delays to calculate the group propagate and generate in the first level of carry logic, two gate delays for the carryout signals in the second level of carry logic, and one gate delay for the sum signals. The second level of carry logic for the 16-bit CLA adder contributes an additional two gate delays over the 4-bit CLA adder, thus increasing the total to six gate delays. Continuing in this manner (a 64-bit add takes eight gate delays, a 256-bit add takes ten gate delays), we see that the delay of a CLA adder depends on the number of levels of carry logic, and not directly on the length of the adder.

If a group size of four is chosen, then the number of levels in an N-bit CLA is expressed in Equation 2.21, and in general the number of levels in a CLA for a group size of k is expressed in Equation 2.22. For an N-bit CLA adder, each level of carry logic introduces two gate delays, in addition to a gate delay for the generate and propagate signals and a gate delay for the sum. The total gate delay is expressed in Equation 2.23, which shows that the delay of a CLA adder is logarithmically dependent on the size of the adder. This theoretically results in one of the fastest adder architectures.

$$\text{CLA levels (with group size of 4)} = \lceil \log_4 N \rceil \tag{2.21}$$
$$\text{CLA levels (with group size of } k) = \lceil \log_k N \rceil \tag{2.22}$$
$$\text{CLA gate delay} = 2 + 2 \cdot \lceil \log_k N \rceil \tag{2.23}$$
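The level count and gate delay expressions can be evaluated with a short sketch that reproduces the gate delay figures quoted in the text. The function names are hypothetical; the ceiling of the logarithm is computed with integer arithmetic to avoid floating-point rounding at exact powers of k.

```python
def cla_levels(n, k=4):
    """Number of CLA logic levels, ceil(log_k(n)), using integers only."""
    levels, span = 0, 1
    while span < n:
        span *= k       # each extra level covers k times more bits
        levels += 1
    return levels


def cla_gate_delay(n, k=4):
    """Total gate delays: 1 for g/p, 1 for the sum, 2 per level of carry logic."""
    return 2 + 2 * cla_levels(n, k)


delays = {n: cla_gate_delay(n) for n in (4, 16, 64, 256)}
print(delays)
```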

From a VLSI design perspective, this adder may take more time to implement, but there is still regularity in the architecture that allows long adders to be built fairly easily. The reuse of the CLA logic contributes to the feasibility of building a long adder without additional design time. Also, after an adder is built, it can be used as a sub cell, as is done with the 4-bit adders as blocks in the 16-bit CLA adder. A drawback of CLA adders is their larger area. There is a large amount of hardware dedicated to calculating the carry bits from cell to cell. However, if the application calls for high performance, then the benefits of decreased delay can outweigh the larger area.

2.3.4 Carry Select Adder
Adding two numbers by using redundancy can speed addition even further. That is, for any number of sum bits we can perform two additions, one assuming the carry-in is 1 and one assuming the carry-in is 0, and then choose between the two results once the actual carry-in is known. This scheme, proposed by Sklansky in 1960, is called conditional-sum addition [21]. An implementation of this scheme was first realized by Bedrij and is called the Carry Select Adder (CSLA) [22]. The CSLA divides the adder into blocks that have the same input operands except for the carry-in. Figure 2.12 shows a possible implementation for a 16-bit CSLA using ripple carry adder blocks. The carryout of the first block is used as the select line for the 9-bit 2-to-1 mux. The second and third blocks calculate the signals sum16 – sum8 in parallel, with one block having its carry-in hardwired to 0 and the other hardwired to 1. After one 8-bit ripple adder delay there is only the delay of the mux to choose between the results of block 2 or 3. Equation 2.24 shows the delay for this adder. The 16-bit CSLA can also be built by dividing it into even more blocks. Figure 2.13 shows the block diagram for the adder if it were divided into 4-bit RCA blocks. Equation 2.25 expresses the delay for this structure.

$$t_{CSLA16a} = t_{8bitRCA} + t_{9bitmux} \tag{2.24}$$
$$t_{CSLA16b} = t_{4bitRCA} + 3 \cdot t_{5bitmux} \tag{2.25}$$
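The carry select scheme can be modelled behaviourally as follows: every block after the first is added twice, once per assumed carry-in, and the real carry picks the result. The function names are hypothetical, and the model shows the selection logic only, not the parallel timing.

```python
def rca(a_bits, b_bits, cin):
    """Ripple carry addition of two bit lists (LSB first): (sum_bits, carry_out)."""
    sums, c = [], cin
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ c)
        c = (a & b) | (c & (a ^ b))
    return sums, c


def carry_select_add(a, b, n=16, block=4):
    """Add two n-bit integers with a linear carry select adder."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    # First block: a single RCA with carry-in 0.
    sums, carry = rca(a_bits[:block], b_bits[:block], 0)
    for lo in range(block, n, block):
        hi = lo + block
        s0, c0 = rca(a_bits[lo:hi], b_bits[lo:hi], 0)   # assumes carry-in = 0
        s1, c1 = rca(a_bits[lo:hi], b_bits[lo:hi], 1)   # assumes carry-in = 1
        # Mux: the previous block's carry selects one precomputed result.
        chosen, carry = (s1, c1) if carry else (s0, c0)
        sums.extend(chosen)
    total = sum(bit << i for i, bit in enumerate(sums))
    return total | (carry << n)


with_8bit_blocks = carry_select_add(0xABCD, 0x1234, block=8)
with_4bit_blocks = carry_select_add(0xABCD, 0x1234, block=4)
```

Both block sizes give the same sum; they differ only in how the delay splits between the RCA and the chain of muxes.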

The CSLA described so far is called the Linear Carry Select Adder, because its delay is linearly dependent on the length of the adder. In the worst case, the carry signal

Figure 2.12 Schematic for a 16-bit CSLA with 8-bit RCA blocks

must ripple through each mux in the adder. Also, notice that the subcells all finish their additions at the same time, yet the more significant bits wait at the input of the mux to be selected. From a VLSI design perspective, the CSLA uses a large amount of area compared to the other adders. There is hardware in this architecture which computes results that are thrown away on every addition, but the

Figure 2.13 Schematic for a 16-bit CSLA with 4-bit RCA blocks

fact that the delay for an addition can be replaced by the delay of a mux makes this architecture very fast. Also, the Linear CSLA has a regularity that makes it easier to lay out.

2.3.5 SQRT Carry Select Adder
To increase the speed further, the SQRT technique was developed. In this design the number of bits per block is not constant; the corresponding delay equation is given in Equation 2.26. Using this technique, the bits per block for a 16-bit SQRT CSLA are 2-2-3-4-5, and for an 8-bit adder the sequence is 1-3-4. The 16-bit SQRT CSLA is shown in Figure 2.14.

$$t_{add} = t_{setup} + (m \times t_{carry}) + \sqrt{2n} \times t_{mux} + t_{sum} \qquad (2.26)$$

Figure 2.14 Schematic for a 16-bit SQRT CSLA

2.4 Low power design techniques
Designing systems for low power is not a straightforward task, as it involves all the IC design stages, beginning with the system behavioral description and ending with the fabrication and packaging processes. In some of these stages there are clear guidelines and steps to follow that reduce power consumption, such as decreasing the power-supply voltage. In other stages there are no clear steps to follow, so statistical or probabilistic heuristic methods are used to estimate the power consumption of a given design. There are three major components of power dissipation in complementary metal-oxide-semiconductor (CMOS) circuits:
1) Switching Power: power consumed by the circuit node capacitances during transistor switching.
2) Short Circuit Power: power consumed because of the current flowing from power supply to ground during transistor switching.
3) Static Power: power due to leakage and static currents.
The first two components are together referred to as dynamic power, as given in Equation 2.1. Dynamic power constitutes the majority of the power dissipated in CMOS VLSI circuits. It is the power dissipated during charging or discharging the load capacitances of a given

circuit. It depends on the input pattern, which at every clock cycle either causes the transistors to switch (consuming dynamic power) or not to switch (consuming no dynamic power). The summation is over all the nodes of the circuit. Reducing any of these components lowers power consumption, although it is equally important to increase the system clock frequency for faster operation. Estimating the power of a large circuit is a complex task. Heuristic algorithms and statistical and probabilistic methods are used to generate random input patterns to test the switching activity of the circuit. These methods become less accurate as the size of the circuit increases. It is better to decompose the large circuit into smaller modules and then use these methods to estimate the power consumption of each module. When the decomposed modules are small enough, exact methods can be used to optimize their performance.

2.4.1 Transistor sizing optimization
The transistor sizing for optimal performance is technology dependent. As the demand for high speed, low power consumption, and high packing density continues to grow each year, there is a need to scale devices to smaller dimensions. As the market trend moves towards a greater scale of integration, the move towards a reduced supply voltage also has the advantage of improving the reliability of IC components of ever-reducing dimensions. This can be easily understood if one recalls that IC components with smaller dimensions have more of a tendency to break down at high voltages. It has already been accepted that scaled-down CMOS devices, even at 2.5 V, do not sacrifice device performance as they maintain device reliability.
Scaling brings about the following benefits: improved device characteristics for low-voltage operation due to the improvement in current-driving capability, reduced capacitance through small geometries and junction capacitances, improved interconnect technology, and higher density of integration.

The major device problem associated with simple scaling lies in the increase of the threshold voltage and the decrease of the carrier surface mobility when the substrate doping concentration is increased to prevent punch-through.

2.4.2 Low-power clock distribution
The clock network constitutes one of the most important parts of a synchronous very large scale integration (VLSI) chip, as it can significantly influence the speed, area, and power dissipation of the system. Recent research on clock network construction has developed procedures for building zero or near-zero skew clock networks with sharp clock edge rates at the clock utilization points. However, one major drawback associated with clock networks is their power dissipation. Studies have shown that the clock network can dissipate 20-50% of the total power on a chip. In the context of the growing importance of low-power designs for portable electronics, it is necessary to develop strategies that significantly reduce the power dissipation of the clock network, since this will lead to a major reduction in the overall power dissipation of the chip. By using a lower Vdd to distribute the clock signal over the chip, the clock network can be made to dissipate less power. However, for reasons related to performance requirements, the rest of the circuitry on the chip may use a higher Vdd, which implies that the clock levels have to be converted to this higher value at the utilization points.

2.4.3 Low power design through voltage scaling
Equation (2.1) shows that the average switching power dissipation is proportional to the square of the power supply voltage; hence, a reduction of VDD will significantly reduce the power consumption. If the power supply voltage is scaled down while all other parameters are kept constant, the propagation delay time increases. The dependence of circuit speed on the power supply voltage, together with the above equation, suggests that a quadratic reduction of power consumption is possible as the power supply voltage is reduced. If the circuit is always operated at the maximum frequency allowed by its propagation delay, the operating frequency, or the number of switching events per unit

time will drop as the propagation delay becomes larger with the reduction of the power supply voltage. The net result is that the dependence of switching power dissipation on the power supply voltage becomes stronger than a simple quadratic relationship. The propagation delay expressions show that the negative effect of reducing the power supply voltage on delay can be compensated for if the threshold voltage of the transistors (VT) is scaled down accordingly. However, this approach is limited because the threshold voltage may not be scaled to the same extent as the supply voltage. When scaled linearly, reduced threshold voltages allow the circuit to produce the same speed performance at a lower VDD.

2.4.4 Reduction of switching activity
Switching activity can be reduced by algorithmic optimization, proper choice of logic topology, glitch reduction, and gated clock signals.

Algorithmic optimization
This depends heavily on the application and the characteristics of the data, such as dynamic range, correlation, and statistics of data transmission. The representation of data can have a significant impact on switching activity at the system level. In applications where data bits change sequentially and are highly correlated, the use of Gray coding leads to a reduced number of transitions compared to binary coding. Another example is the use of sign-magnitude representation instead of conventional two's complement representation for signed data. A change in sign causes transitions of the higher-order bits in the two's complement representation, whereas only the sign bit changes in sign-magnitude representation. Hence, switching activity can be reduced by using the sign-magnitude representation in applications where the data sign changes frequently.

Glitch reduction
An important architecture-level measure to reduce switching activity is based on delay balancing and reduction of glitches.
In multi-level logic circuits, the propagation delay from one logic block to the next can cause spurious signal transitions, or glitches. Glitches occur primarily due to a mismatch or imbalance in the path lengths in the logic network. Such a mismatch in path lengths results in a

mismatch of signal timing with respect to the primary inputs. Redesigning the logic network in order to balance the delay paths can significantly reduce glitches and, consequently, the dynamic power dissipation in complex multi-level networks.

Gated Clock Signals
Another effective design technique for reducing the switching activity in CMOS logic circuits is the use of conditional or gated clock signals. If certain logic blocks in a system are not used during the current clock cycle, temporarily disabling the clock signals of these blocks saves switching power that is otherwise wasted. An N-bit number comparator compares the magnitudes of two unsigned N-bit binary numbers and produces an output to indicate which one is larger. In the conventional approach, all input bits are first latched into two N-bit registers and subsequently applied to the comparator circuit. In this case, two N-bit register arrays dissipate power in every cycle. Yet, if the most significant bits of the two binary numbers differ from each other, the decision can be made by comparing the MSBs only. The two MSBs are latched in a two-bit register which is driven by the original system clock. At the same time, these two bits are applied to an XNOR gate, and its output is used to generate the gated clock signal with an AND gate. If the two MSBs are different, the XNOR produces logic 0 at the output, disabling the clock signal of the lower-order registers. If the two MSBs are the same, the gated clock signal is applied to the lower-order registers and the decision is made by the (N-1)-bit comparator. The gated clock strategy effectively reduces the overall switching power dissipation of the system by about 50%, since a large portion of the system is disabled for half of all input combinations.

2.4.5 Reduction of switching capacitance
The amount of switched capacitance plays a significant role in the dynamic power dissipation of the circuit.
Hence, reduction of this parasitic capacitance is a major goal for low-power design of digital integrated circuits.

System-Level Measures
At the system level, one approach to reduce the switched capacitance is to limit the use of shared resources. If a single shared bus is connected to all modules,

for example, a large bus capacitance comes into play due to the large number of drivers and receivers sharing the same transmission medium, and the parasitic capacitance of the long bus line. Driving this large capacitance requires a significant amount of power during each bus access. Alternatively, the global bus structure can be partitioned into a number of smaller dedicated local buses to handle the data transmission between neighboring modules. As a result, the switched capacitance during each bus access is significantly reduced, although multiple buses may increase the overall routing area on the chip.

Circuit-Level Measures
The type of logic style used to implement a digital circuit also affects the output load capacitance of the circuit. The capacitance is a function of the number of transistors that are required to implement a given function. Pass-gate logic design is attractive since fewer transistors are required for certain functions such as XOR and XNOR. Pass-transistor structures typically require complementary control signals; dual-rail logic is used to provide all signals in complementary form. This diminishes the inherent advantages of pass-transistor logic gates over conventional CMOS logic. Thus, the use of pass-transistor logic gates to achieve low power dissipation must be carefully considered, and the choice of logic design style must ultimately be based on a detailed comparison of all design aspects such as silicon area, overall delay, and switching power dissipation.

Mask-Level Measures
The amount of parasitic capacitance that is switched (i.e., charged up or charged down) during operation can also be reduced at the physical design level, or mask level. A simple mask-level measure to reduce power dissipation is keeping the transistors at minimum dimensions whenever possible and feasible, thereby minimizing the parasitic capacitances.
Designing a logic gate with minimum-size transistors certainly affects the dynamic performance of the circuit, and this trade-off between dynamic performance and power dissipation should be carefully considered in critical circuits.
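The Gray-coding remark in Section 2.4.4 can be checked numerically. The sketch below counts bit transitions for a counter that steps through every value once; the helper names `to_gray`, `count_transitions`, and `compare_counter_activity` are illustrative assumptions, and bit flips are used only as a rough proxy for switching activity, not as a power figure.

```python
def to_gray(n):
    """Convert a binary number to its reflected Gray code."""
    return n ^ (n >> 1)


def count_transitions(codes):
    """Total number of bit flips between consecutive code words."""
    return sum(bin(prev ^ cur).count("1") for prev, cur in zip(codes, codes[1:]))


def compare_counter_activity(bits):
    """Return (binary_flips, gray_flips) for one pass through a bits-wide counter."""
    values = list(range(1 << bits))
    binary_flips = count_transitions(values)
    gray_flips = count_transitions([to_gray(v) for v in values])
    return binary_flips, gray_flips
```

For a 4-bit counter, `compare_counter_activity(4)` gives 26 bit flips for binary coding against 15 for Gray coding, because consecutive Gray code words always differ in exactly one bit.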

2.5 Different logic styles
Several variants of static CMOS logic styles have been used to implement low-power 1-bit adder cells, and each design style has its own merits and demerits. In general, logic styles can be broadly divided into two major categories:
1) Static logic style
2) Dynamic logic style
A major distinction, also with respect to power dissipation, must be made between static and dynamic logic styles. As opposed to static gates, dynamic gates are clocked and work in two phases, a precharge phase and an evaluation phase. The logic function is realized in a single NMOS pull-down or PMOS pull-up network, resulting in small input capacitances and fast evaluation times. This makes dynamic logic attractive for high-speed applications. However, the large clock loads and the high signal transition activity due to the precharging mechanism result in excessively high power dissipation. Also, the usage of dynamic gates is not as straightforward and universal as it is for static gates, and robustness is considerably degraded. With the exception of some very special circuit applications, dynamic logic is not a viable candidate for low-power circuit design. Although all full adder cells perform the same function, the way they generate the intermediate nodes and the outputs differs, the loads on the inputs and intermediate nodes differ, and the transistor count varies significantly. The standard implementations of the full adder cell include the following:
1) Double pass-transistor logic, which uses both N- and P-channel transistors, with dual logic paths for every function. It uses 28 transistors.
2) The complementary pass-transistor logic (SR-CPL) full adder, which has 26 transistors and uses the CPL logic family.
3) The multiplexer-based low-power full adder, which uses 34 transistors and relies only on multiplexer operations.

All these adder cells are compared based on power consumption, speed, power-delay product, area, and driving capability. Classical designs of full adders normally use only one logic style for the whole full-adder design, while hybrid designs exploit the features of different logic styles to improve upon the performance of designs that use a single logic style. All hybrid designs use the best available modules implemented in different logic styles, or enhance the available modules, in an attempt to build a low-power full-adder cell. Generally, the main focus in such attempts is to reduce the number of transistors in the adder cell and, consequently, the number of power-dissipating nodes. In doing so, designers often trade off other vital requirements such as driving capability, noise immunity, and layout complexity. Most of these adders lack driving capability because the inputs are coupled to the outputs. Their performance as a single unit or in small chains is good, but when large adders are built by cascading these 1-b full-adder cells, the performance degrades drastically. The performance degradation can be handled by inserting buffers between stages to improve the delay characteristics. However, this adds overhead, and the initial advantage of having fewer transistors is lost.

CHAPTER 3 DESIGN OF ALU AND SQRT CSLA
3.1 Introduction to ALU and SQRT CSLA
The arithmetic logic unit (ALU) is one of the main components inside a microprocessor. It is a digital circuit responsible for performing arithmetic and logic operations such as addition, subtraction, increment, decrement, logical AND, logical OR, logical XOR, and logical XNOR. Generally, the performance of an ALU is limited by its adder because of carry propagation, and many adders have been proposed to reduce the carry propagation delay. To speed up the operation of digital adders, the Ripple Carry Adder (RCA) is modified into the CSLA, and to achieve still more speed the CSLA is replaced by the SQRT CSLA. The CSLA is used in many computational systems to alleviate the problem of carry propagation delay by independently generating multiple carries and then selecting a carry to generate the sum [8]-[9]. However, the CSLA is not area efficient because it uses multiple pairs of Ripple Carry Adders (RCA) to generate the partial sum and carry for carry inputs Cin=0 and Cin=1; the final sum and carry are then selected by the multiplexers (mux). To achieve better area efficiency [10]-[14], a Binary to Excess-1 Converter (BEC) is used in place of the RCA with Cin=1 in the regular CSLA. The 16-bit SQRT CSLA is divided into different blocks. The block size and the number of blocks depend on the size of the SQRT CSLA according to the SQRT technique. From the second block onwards, each block contains three levels: the first level is a ripple carry adder with input carry zero, the second level is a ripple carry adder with input carry one, and the third level is a multiplexer that selects one of the ripple carry adder outputs according to the carry of the previous block. The disadvantage of the SQRT CSLA is its larger area requirement, as it uses two levels of RCAs. To reduce the area, a BEC is used in place of the second-level RCA: in place of a 2-bit RCA, a 3-bit BEC is used.

3.1.1 Delay and Area evaluation methodology of the basic adder blocks
The AND, OR, and Inverter (AOI) implementation of an XOR gate is shown in Figure 3.1. For the delay evaluation, we add up the number of gates in the longest path of each logic block; the area is evaluated by counting the gates each block requires. Based on this approach, the CSLA adder blocks of 2:1 mux, Half Adder (HA), and FA are evaluated and listed in Table 3.1.

Table 3.1 Delay and area for basic gates

Figure 3.1 AOI implementation of XOR gate

3.1.2 Binary to Excess one Converter (BEC)
As stated above, the main idea of this work is to use a BEC instead of the RCA with Cin = 1 in order to reduce the area and power consumption of the regular CSLA. To replace an n-bit RCA, an (n+1)-bit BEC is required. The structure and the function table of a 4-b BEC are shown in Figure 3.2 and Table 3.1.2, respectively.

Figure 3.3 illustrates how the basic function of the CSLA is obtained by using the 4-bit BEC together with the mux. One input of the 8:4 mux gets as its input (B3, B2, B1, and B0) and the other input of the mux is the BEC output. This produces the two possible partial results in parallel, and the mux is used to select either the BEC output or the direct inputs according to the control signal Cin. The importance of the BEC logic stems from the large silicon area reduction when CSLAs with a large number of bits are designed. The Boolean expressions of the 4-bit BEC are listed as (note the functional symbols: NOT, AND, XOR):
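A 4-bit BEC adds one to its input, so its outputs can be sketched with the usual incrementer equations built from NOT, AND, and XOR. The output names X0 to X3 and the functions `bec4` and `bec_mux` below are illustrative assumptions rather than the thesis's own notation.

```python
def bec4(b3, b2, b1, b0):
    """4-bit Binary to Excess-1 Converter: returns the bits of (B + 1) mod 16."""
    x0 = 1 - b0                 # X0 = NOT B0
    x1 = b0 ^ b1                # X1 = B0 XOR B1
    x2 = b2 ^ (b0 & b1)         # X2 = B2 XOR (B0 AND B1)
    x3 = b3 ^ (b0 & b1 & b2)    # X3 = B3 XOR (B0 AND B1 AND B2)
    return x3, x2, x1, x0


def bec_mux(b3, b2, b1, b0, cin):
    """8:4 mux: pass the inputs through when Cin = 0, the BEC output when Cin = 1."""
    return bec4(b3, b2, b1, b0) if cin else (b3, b2, b1, b0)
```

With Cin = 0 the mux forwards the adder result unchanged, and with Cin = 1 it forwards the result plus one, which is what a second RCA with carry-in 1 would otherwise have to compute.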

Figure 3.2 A 4-bit BEC

Figure 3.3 Functional block of CSLA

Figure 3.4 Block diagram for a 16-bit SQRT CSLA

3.1.3 Delay and area evaluation methodology of regular 16-bit SQRT CSLA
The structure of the 16-b regular SQRT CSLA is shown in Figure 3.4. It has five groups of RCAs of different sizes. The delay and area of each group are evaluated below; the numerals within [ ] specify the delay values, e.g., sum2 requires 10 gate delays. The steps leading to the evaluation are as follows.
1) Group2 has two sets of 2-b RCA. Based on the delay values of Table 3.1, the arrival time of the selection input c1 [time (t) = 7] of the 6:3 mux is earlier than s3 [t = 8] and later than s2 [t = 6]. Thus, sum3 [t = 11] is the summation of s3 and mux [t = 3], and sum2 [t = 10] is the summation of c1 and mux.
2) Except for group2, the arrival time of the mux selection input is always greater than the arrival time of the data outputs from the RCAs. Thus, the delay of group3 to group5 is determined by the arrival time of the mux selection input plus the mux delay.
3) One set of 2-b RCA in group2 has 2 FA for Cin = 1, and the other set has 1 FA and 1 HA for Cin = 0. Based on the area count of Table 3.1, the total gate count of group2 is obtained by adding the gate counts of these FAs and the HA to that of the mux.
4) Similarly, the estimated maximum delay and area of the other groups in the regular SQRT CSLA are evaluated and listed in Table 3.2.
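The arrival-time reasoning in steps 1 and 2 amounts to taking the later of the data and select arrival times and adding the mux delay. A minimal sketch follows, using only the gate-delay figures quoted above for group2; the function name `mux_output_time` is an illustrative assumption.

```python
MUX_DELAY = 3  # gate delays through the mux, as quoted for group2


def mux_output_time(data_arrival, select_arrival, mux_delay=MUX_DELAY):
    """A mux output settles once both its data and select inputs have arrived."""
    return max(data_arrival, select_arrival) + mux_delay


c1_arrival = 7                              # selection input of the 6:3 mux
sum2_time = mux_output_time(6, c1_arrival)  # s2 arrives first, so c1 dominates
sum3_time = mux_output_time(8, c1_arrival)  # s3 arrives last, so s3 dominates
```

The two results reproduce the values given in step 1: 10 gate delays for sum2 and 11 for sum3. For the later groups the select input always arrives last, so the `max` reduces to the select arrival time.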

Table 3.2 Delay and area for SQRT CSLA

3.1.4 Delay and area evaluation methodology of modified 16-bit SQRT CSLA
The structure of the proposed 16-b SQRT CSLA, which uses a BEC in place of the RCA with Cin = 1 to optimize the area and power, is shown in Figure 3.5. We again split the structure into five groups. The delay and area estimation of each group is shown in the figure.

Figure 3.5 A 16-bit SQRT CSLA using BEC

1) Group2 has one 2-b RCA, which has 1 FA and 1 HA for carry input zero. Instead of another 2-b RCA with carry input one, a 3-bit BEC is used, which adds one to the output of the 2-b RCA. The sum3 and the final c3 (output from the mux) depend on s3 and the mux, and on the partial c3 (input to the mux) and the mux, respectively. The sum2 depends on c1 and the mux.

2) For the remaining groups, the arrival time of the mux selection input is always greater than the arrival time of the data inputs from the BECs. Thus, the delay of the remaining groups depends on the arrival time of the mux selection input and the mux delay.
3) The area count of group2 is the sum of the gate counts of the FA, the HA, the mux, and the BEC.
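The area comparison for group2 can be sketched by counting unit gates. The per-block counts below are assumptions derived from the usual AND/OR/inverter decompositions (for example, an XOR built from 2 inverters, 2 AND gates, and 1 OR gate) rather than values read from Table 3.1, and the 6:3 mux is assumed to consist of three 2:1 muxes. The constant and function names are illustrative.

```python
# Unit-gate areas under the AOI convention: each AND, OR, or inverter is one gate.
XOR_GATES = 2 + 2 + 1                # 2 inverters, 2 AND, 1 OR
MUX2_GATES = 1 + 2 + 1               # 1 inverter, 2 AND, 1 OR
HA_GATES = XOR_GATES + 1             # XOR for the sum, AND for the carry
FA_GATES = 2 * XOR_GATES + 2 + 1     # 2 XOR, 2 AND, 1 OR
BEC3_GATES = 2 * XOR_GATES + 1 + 1   # 2 XOR, 1 AND, 1 NOT


def group2_gate_count(use_bec):
    """Gate count of group2 with either a second RCA or a 3-bit BEC."""
    cin0_path = FA_GATES + HA_GATES                      # 2-b RCA with carry-in 0
    cin1_path = BEC3_GATES if use_bec else 2 * FA_GATES  # BEC or RCA with carry-in 1
    mux = 3 * MUX2_GATES                                 # 6:3 mux
    return cin0_path + cin1_path + mux
```

Under these assumptions, `group2_gate_count(False)` gives 57 gates for the regular structure and `group2_gate_count(True)` gives 43 gates for the BEC-based structure, so the saving comes entirely from replacing the two full adders of the carry-in-1 path.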

Table 3.3 Delay and area for modified SQRT CSLA

3.1.5 Transistor Level design of existing technique

1) Conventional full adder
A conventional full adder takes 28 transistors to implement the sum and carry functions. The conventional full adder is shown in Figure 3.6.

2) A 2-bit RCA
A two-bit Ripple Carry Adder (RCA) is formed by connecting two full adders. It takes a total of 56 transistors to implement. It is shown in Figure 3.7.

Figure 3.6 A conventional full adder

3) A 3-bit BEC
A 3-bit BEC uses two XOR gates, one AND gate, and one NOT gate, which takes 32 transistors overall, whereas the 2-bit RCA that it replaces takes 56 transistors. A 3-bit BEC is shown in Figure 3.8. A comparison between the 2-bit RCA and the 3-bit BEC is shown in Table 3.4.

Figure 3.7 A 2-bit RCA using conventional full adder

Figure 3.8 Transistor level 3-bit BEC

Table 3.4 Comparison between 2-bit RCA and BEC

Logic for second level   Number of     Critical path   Area     Power dissipation (µW)
                         transistors   delay (ns)      (µm²)    Static   Dynamic   Total
RCA using CMOS           56            1.90            1342     6.706    42.565    49.271
BEC using CMOS           32            1.20            781      3.269    25.746    29.015

Though the BEC technique reduces area and power [16], the reduction is not considerable, and the design is not suitable for sub-threshold-level modifications. The drawback of this logic structure is that it does not reduce the area and power to a satisfactory level. There is also still scope to reduce the delay. In order to improve the delay, a new logic structure for a full-adder cell is proposed.

3.2 ALU
The arithmetic logic unit (ALU) is one of the main components inside a microprocessor. It is a digital circuit responsible for performing arithmetic and logic operations such as addition, subtraction, increment, decrement, logical AND, logical OR, logical XOR, and logical XNOR. The ALU is a fundamental building block of the Central Processing Unit (CPU) of a computer, and even the simplest microprocessors contain one. The processors found inside modern CPUs and Graphics Processing Units (GPUs) contain very powerful ALUs. We have designed the ALU using multiplexer and full adder circuits. The input and output sections consist of 4x1 and 2x1 multiplexers, and the logic is implemented using a full adder, which performs the computing function of the ALU. A full adder can be defined as a combinational circuit that forms the arithmetic sum of three input bits; it has three inputs and two outputs. The ALU is built from 4x1 muxes, 2x1 muxes, and an 8T full adder, and all of its blocks are designed using Gate Diffusion Input (GDI).

3.2.1 GDI Technique
There is scope to reduce power, area, and delay using the GDI cell technique. A simple GDI cell is shown in Figure 3.9. Any Boolean function can be implemented using GDI cells. Low-swing problems arise because the inputs are applied directly to the sources of the P and N transistors: an N transistor is weak at passing logic high, and a P transistor is weak at passing logic low. When a transition occurs from high to low at the P transistor source, or from low to high at the N transistor source, the low-swing problem arises.
A point that demands special emphasis is that in 50% of the cases the GDI cell operates as a regular CMOS inverter, which is widely used as a digital buffer for logic-level restoration. In some of these cases, when Vdd = 1 without a swing drop from the

previous stages, a GDI cell functions as an inverter buffer and recovers the voltage swing. Basic logic gates are shown in Figure 3.10.

Figure 3.9 Simple GDI cell

Figure 3.10 Basic logic gates using the GDI cell

3.2.2 A 10-transistor full adder
A full adder built with the GDI technique takes 10 transistors, whereas a conventional full adder takes 28 transistors. It is shown in Figure 3.11.

3.2.3 An 8-transistor full adder
A full adder can be implemented with 8 transistors by using the GDI technique. The 10-transistor full adder differs from the 8-transistor full adder by two pull-up transistors. The 8-transistor full adder is shown in Figure 3.12.

Figure 3.11 A 10-transistor full adder

Figure 3.12 An 8-transistor full adder

3.2.4 A 1-bit ALU
The ALU is designed using multiplexers and a full adder circuit. The input and output sections consist of 4x1 and 2x1 multiplexers, and the logic is implemented using a full adder. A set of three select signals has been incorporated in the design to

determine the operation being performed and the inputs and outputs being selected. Figure 3.13 shows the block diagram of the 1-bit ALU using two 4x1 multiplexers and one 2x1 multiplexer. The complement of B is used for the SUBTRACTION operation: the full adder performs the subtraction by the two's complement method. Table 3.5 shows the truth table for the operations performed by the ALU based on the status of the select signals.

Table 3.5 Truth table of one bit ALU

s2   s1   s0   Operation
0    0    0    AND
0    0    1    XOR
0    1    0    XNOR
0    1    1    OR
1    0    0    DECREMENT
1    0    1    ADDITION
1    1    0    SUBTRACTION
1    1    1    INCREMENT
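The operation set of Table 3.5 can be modelled behaviourally as a chain of 1-bit slices built around a full adder. This is only a sketch under stated assumptions: the select codes are assumed to count upward from 0 to 7 in the order the operations are listed, the way the second adder operand and the initial carry are chosen is an illustrative guess at what the input multiplexers do, and the names `full_adder`, `alu_slice`, and `alu` are hypothetical.

```python
def full_adder(a, b, cin):
    """1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def alu_slice(a, b, cin, op):
    """One ALU bit; op is the assumed 3-bit select code (s2 s1 s0) as an integer."""
    if op == 0:
        return a & b, 0            # AND
    if op == 1:
        return a ^ b, 0            # XOR
    if op == 2:
        return 1 - (a ^ b), 0      # XNOR
    if op == 3:
        return a | b, 0            # OR
    # Arithmetic operations: pick the second adder operand (assumed mux behaviour).
    operand = {4: 1,               # DECREMENT: add all ones
               5: b,               # ADDITION
               6: 1 - b,           # SUBTRACTION: complement of B
               7: 0}[op]           # INCREMENT: add zero, carry-in supplies the +1
    return full_adder(a, operand, cin)


def alu(a, b, op, width=8):
    """Ripple the slices together; SUBTRACTION and INCREMENT start with carry-in 1."""
    carry = 1 if op in (6, 7) else 0
    result = 0
    for i in range(width):
        bit, carry = alu_slice((a >> i) & 1, (b >> i) & 1, carry, op)
        result |= bit << i
    return result
```

For example, `alu(5, 3, 6)` adds 5 to the complement of 3 with a carry-in of 1, which is the two's complement subtraction described above and yields 2.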

Figure 3.13 A 1-bit ALU

3.2.5 8-bit ALU using ripple carry adders
An 8-bit ALU is formed by connecting eight 1-bit ALUs in series. The 8-bit ALUs using 10-transistor and 8-transistor full adders are shown in Figure 3.14.

Figure 3.14 Eight-bit ALU using 10- and 8-transistor full adders

An eight-bit ALU using ripple carry adders has a large propagation delay, because the speed of the ALU is limited by the propagation of the carry. To reduce the carry propagation delay, the proposed design using a carry select adder is implemented.

CHAPTER 4 DESIGN OF ALU USING MODIFIED SQRT CSLA
4.1 Introduction to different transistor types
Combinational logic forms the core of most digital integrated circuits, such as fast arithmetic units and controllers. The design requirements imposed on the logic circuitry can vary widely. Area is often the prime concern, as it has a direct impact on cost. In many state-of-the-art designs, speed tends to be the dominating requirement; contemporary microprocessors are excellent examples of designs in this class. For other applications, minimizing the power consumption is crucial, as in the design of portable applications such as mobile telephones. These different design requirements generally translate into the use of different circuit styles, or even different manufacturing technologies. Static CMOS has excellent properties in many areas: low sensitivity to noise and process variations, excellent speed, and low power consumption. Most of those properties carry over to more complex static CMOS gates; however, gates such as NAND gates with three or more inputs become large and slow. Other design styles, such as the complementary, the ratioed, and the pass-transistor logic styles, have been devised to address this issue, all of which belong to the class of static circuits.

4.1.1 Complementary CMOS
A static CMOS gate is a combination of two networks, called the pull-up network (PUN) and the pull-down network (PDN). The PUN consists solely of PMOS transistors and provides a conditional connection to Vdd. The PDN potentially connects the output to Vss and contains only NMOS devices. The PUN and PDN networks should be designed so that, whatever the value of the inputs, one and only one of the networks is conducting in steady state. In this way, a path always exists between Vdd and the output, realizing a high output (one), or alternatively between Vss and the output for a low output (zero).
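The requirement that exactly one network conducts can be illustrated with a 2-input NAND gate, taken here as an assumed example: its PDN is two NMOS devices in series and its PUN is two PMOS devices in parallel. The function names below are illustrative, and the model is purely logical (no timing or voltages).

```python
def nand_pdn_conducts(a, b):
    """Series NMOS devices: a path to Vss exists only when both inputs are high."""
    return a == 1 and b == 1


def nand_pun_conducts(a, b):
    """Parallel PMOS devices: a path to Vdd exists when either input is low."""
    return a == 0 or b == 0


def nand_output(a, b):
    """Steady-state output of the gate, checking the complementary property."""
    pun = nand_pun_conducts(a, b)
    pdn = nand_pdn_conducts(a, b)
    # Exactly one of the two networks may conduct for any input combination.
    assert pun != pdn, "PUN and PDN must be complementary"
    return 1 if pun else 0
```

Because the PUN condition is the logical complement of the PDN condition, the output is always driven and never shorted, which is why the gate draws no static current in steady state.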


Properties of complementary CMOS
Complementary CMOS gates inherit all the desirable properties of the static inverter: high noise margins, comparable rise and fall times, and no static power consumption, as there is never a direct path between Vdd and Vss in steady state. The complementary gate is inverting (implementing functions such as NAND, NOR and XNOR). Implementing a non-inverting Boolean function (such as AND, OR or XOR) in one stage is not possible and requires the addition of an extra inverter stage.

4.1.2 Pseudo NMOS
A grounded PMOS device presents an even better load. This configuration, which is called pseudo-NMOS because it resembles the depletion NMOS load, is superior to the other approach. First, the PMOS transistor does not experience any body effect, as its Vsb is constant and equal to 0. Second, the PMOS device is driven by a Vgs equal to -Vdd, resulting in a higher load-current level for similarly sized devices.

Figure 4.1 Pseudo NMOS

An important disadvantage is that the gate consumes static power when the output is low, because a direct path exists between Vdd and ground through the load and the driver devices. The grounded PMOS load is a good imitation of an ideal current-source load. For certain circuit configurations, some simple modifications can further improve either the speed or the power consumption. The following approach eliminates the static current completely.

4.1.3 Differential cascode voltage switch logic (DCVSL)
Let us consider that the complement of each signal is always available. This requires each gate to generate both polarities of the output signal. Such a gate is called Differential Cascode Voltage Switch Logic (DCVSL). PDN1 and PDN2 are complementary, and implement the required logic function and its inverse. Assume now that, for a given set of inputs, PDN1 conducts while PDN2 does not. Node out is pulled down. This turns on the load transistor M2, pulling up out'. This in turn cuts off load transistor M1. The gate is clearly free of static current paths, as only PDN1 and M2 are conducting.

Figure 4.2 DCVSL logic gate: basic principle

Figure 4.3 XOR-XNOR gates


The availability of complementary signals eliminates extra inverter stages. The example circuit implements a two-input XOR and XNOR gate. The transistors connected to the A inputs are shared between the two PDNs. DCVSL has, for instance, been used for the implementation of fast error-correcting logic in memories. The DCVSL gate has a speed advantage: the reduction of the parasitic capacitances at the output nodes produces a faster response. At the same time, the static power consumption is eliminated. This comes at the expense of extra area, as each gate requires two pull-down networks.

4.1.4 Pass transistor logic
This is another promising approach, which implements complex logic as a logical network of switches or pass transistors. The pass transistor approach has the advantage of being simple and fast. Complex CMOS combinational logic is implemented with a minimal number of transistors, which reduces the parasitic capacitances and results in fast circuits. The static and transient performance of such a structure strongly depends upon the availability of a high-quality switch with low parasitic capacitance and resistance. Although the MOS transistor in itself is a switch of reasonable performance, it has some deficiencies. Pass transistor logic networks are, therefore, often constructed from bidirectional transmission gates (pass gates). These gates are composed of an NMOS transistor and a PMOS device in a parallel arrangement. The gate acts as a bidirectional switch controlled by the gate signal C. When C=1, both MOSFETs are on, allowing the signal to pass through the gate, i.e., A=B if C=1. On the other hand, C=0 places both transistors in cutoff, creating an open circuit between nodes A and B.

Figure 4.4 Pass transistor logic


Although the transmission gate possesses some excellent properties, such as an almost constant resistance and no threshold loss, it has the disadvantage that it requires both an NMOS and a PMOS transistor, which have to be located in different wells. This reduces the layout efficiency of the design. Also, the control signal has to be presented in both polarities, which once again has a negative influence on the layout density. Furthermore, the parallel connection of PMOS and NMOS results in increased node capacitances and reduced performance. It would therefore be advantageous if the transmission gate could be implemented using NMOS transistors only. Unfortunately, NMOS-only pass transistors are subject to voltage loss. This is not a problem if the voltage levels are subsequently restored by a complementary CMOS inverter, but such a circuit suffers from two major drawbacks: reduced noise margin, due to the threshold voltage drop, and static power consumption. Several techniques have been proposed to get around this problem.

4.1.5 Transmission gate logic
Transmission gate logic includes at least two field-effect transistor elements used as pass transistors, each having a channel of conductivity type opposite to that of the other (i.e., complementary FETs). A transmission gate is a switching element which connects the input to the output according to the gate input. It is a parallel connection of an n-transistor, which is good at passing logic zero, and a p-transistor, which is good at passing logic one. The basic arrangement of the transmission gate is shown in figure 4.5.

Figure 4.5 A simple transmission gate


4.2 Special Hardware using Multiplexers (SHM)
Though the BEC technique reduces area and power, the reduction is not considerable, and the design is not suitable for sub-threshold level modifications. The 16-bit SQRT CSLA using BEC in its second level requires 792 transistors. There is scope to reduce the number of transistors, and with it the area and the power dissipation, by using the proposed logic: a 16-bit SQRT CSLA implemented with the proposed logic requires 736 transistors.

The proposed replacement for the second-level RCA is Special Hardware using Multiplexers (SHM), as shown in figure 4.6. The inputs are applied to the first-level RCA. The output of the RCA is applied to the second-level SHM and then to the third-level multiplexer, which selects either the RCA output or the SHM output according to the previous carry.

The SHM takes the first-level RCA output as its input and increments its value by one. A 3-bit SHM uses three multiplexers and three inverters; b0, b1, b2 are its inputs and x0, x1, x2 are the corresponding outputs. The first inverter produces the first output bit x0 from input bit b0, and that output is used as the select line of the first multiplexer. The first multiplexer passes either the second bit b1 or the inversion of b1 to the output, because the first inverter output acts like a carry into the second bit. The first multiplexer thus gives the second output bit x1, which is used as the select line of the second multiplexer. From x1 and b1, the second multiplexer generates the carry for input bit b2: one of its inputs is b1, the other is grounded, and its output drives the select line of the third multiplexer. The third multiplexer passes the third bit or the inversion of the third bit to the output according to this carry. The logic can be extended to any number of bits. Here it is implemented for the second block, with two inputs under consideration.

When the number of inputs is increased, the proposed technique produces more efficient results on a large scale. One point to note is that, despite the above advantages, the delay is increased, as the carry has to pass through 2(n-1) levels in an n-bit SHM in order to appear at the output. The comparison between the numbers of transistors is shown in table 4.1.


Figure 4.6 A 3-bit SHM

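$$x_0 = \overline{b_0}$$
$$x_1 = x_0 \cdot b_1 + \overline{x_0} \cdot \overline{b_1}$$
$$x_2 = (x_1 + \overline{b_1}) \cdot b_2 + \overline{x_1} \cdot b_1 \cdot \overline{b_2}$$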
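The multiplexer chain described above can be checked with a small behavioural model. The following is a minimal sketch, not the transistor-level design: the helper names `mux2` and `shm3` are hypothetical, and the multiplexer input ordering is an assumption chosen so that the block increments its 3-bit input by one, as the text states.

```python
def mux2(sel, in0, in1):
    """2:1 multiplexer: returns in1 when sel is 1, otherwise in0."""
    return in1 if sel else in0


def shm3(b0, b1, b2):
    """Behavioural model of a 3-bit SHM (add-one block).

    Bits are 0/1 integers and b0 is the least significant bit.
    The wiring below is an assumed reading of the description.
    """
    x0 = 1 - b0                    # first inverter: the LSB always toggles
    x1 = mux2(x0, 1 - b1, b1)      # x0 = 1 means no carry, so b1 passes unchanged
    carry = mux2(x1, b1, 0)        # grounded input is selected when x1 = 1
    x2 = mux2(carry, b2, 1 - b2)   # third multiplexer inverts b2 on a carry
    return x0, x1, x2


def to_int(bits):
    """Convert an LSB-first tuple of bits to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))
```

Sweeping all eight input combinations and comparing `to_int(shm3(...))` with `(n + 1) % 8` confirms that this model behaves as an incrementer.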

Table 4.1 Area comparison between 2-bit RCA and BEC

Type of logic   Gates    Number of transistors
3-bit BEC       2 XOR    24
                1 AND    6
                1 NOT    2
                Total    32
3-bit SHM       3 MUX    18
                3 NOT    6
                Total    24


4.2.1 Transistor level design of SHM
A 3-bit SHM takes 24 transistors, as shown in figure 4.7. The corresponding functional verification and critical path details are shown in figure 4.8, and the waveforms and power dissipation window are shown in figure 4.9.

Figure 4.7 Transistor level 3-bit SHM


Figure 4.8 Critical path details of a 3-bit SHM

Figure 4.9 Power dissipation of a 3-bit SHM

The power and area of the existing technique (BEC) and the proposed technique (SHM) are compared in table 4.2.


Table 4.2 Power and delay comparison between 2-bit RCA and BEC

Logic for second level   Number of transistors   Critical path delay (ns)   Area (µm²)   Static power (µW)   Dynamic power (µW)   Total power (µW)
BEC using CMOS           32                      1.200                      781          3.269               25.746               29.015
SHM using CMOS           24                      2.350                      486          3.100               22.843               25.943

4.3 An 8-bit ALU using proposed carry select adder
The proposed technique is applied to an 8-bit ALU with the 10-transistor full adder and with the 8-transistor full adder. The corresponding circuit diagram is shown in figure 4.10.


Figure 4.10 Eight-bit ALU using modified SQRT CSLA

4.4 Wave forms
The output waveforms are obtained by applying a 20 ns clock to every input. For the 8-bit ALU using the proposed technique with the 10-transistor full adder, the output waveforms and power dissipation are shown in figure 4.11; for the 8-transistor full adder, the waveforms are shown in figure 4.12.


Figure 4.11 Wave forms of 8-bit ALU for 10-transistor full adder

Figure 4.12 Wave forms of 8-bit ALU for 8-transistor full adder


CHAPTER 5 RESULTS
5.1 Comparative analysis of existing CSLA and modified CSLA
In designing the 8-bit ALU using the efficient carry select adder, all the blocks of the 16-bit SQRT CSLA, including the second level of the second block (3-bit BEC and 3-bit SHM), are implemented in Dsch2.6c (Logic Editor) and synthesized in Microwind 2.6a (Layout Editor) in 0.12 µm technology with 1.2 V as the logic high voltage. The first level of the second block in the 16-bit SQRT CSLA is a two-bit RCA, which requires 56 transistors when implemented in CMOS logic. The second level of the second block is the 3-bit SHM in the proposed design; it uses 24 transistors. The third level of the second block is the multiplexer. A simple 2x1 multiplexer uses six transistors in CMOS technology. Block 2 needs three 2x1 multiplexers, hence eighteen transistors are required for their implementation. The total number of transistors required for the complete block 2 is only 98 when SHM is used; otherwise it requires 106 transistors with the BEC technique. With SHM, block 3 requires only 146 transistors, block 4 requires 194 and block 5 requires 242. With the BEC technique, block 3 requires 158, block 4 requires 210 and block 5 requires 262 transistors. Using SHM, the implementation of a 16-bit SQRT CSLA requires 736 transistors, whereas it requires 792 transistors with the BEC technique. Finally, the complete second block of the 16-bit SQRT CSLA is implemented with BEC and with SHM using CMOS technology, and the observed results are shown in tables 5.1 and 5.2.
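The transistor counts quoted above can be cross-checked with a few lines of arithmetic. This is a minimal sketch: the per-block figures are taken from the text, while the count of 56 transistors for the first block is an assumption (the text does not state it explicitly; it is inferred here from the quoted totals, and corresponds to a 2-bit RCA).

```python
# Building blocks quoted in the text (CMOS implementations).
RCA_2BIT = 56                           # first level of block 2
MUX_2TO1 = 6                            # one 2x1 multiplexer
SECOND_LEVEL = {"BEC": 32, "SHM": 24}   # 3-bit second-level logic

# Blocks 3 to 5 of the 16-bit SQRT CSLA, as quoted in the text.
BLOCKS_3_TO_5 = {"BEC": [158, 210, 262], "SHM": [146, 194, 242]}

# Assumption: block 1 is a 2-bit RCA with 56 transistors.
FIRST_BLOCK = 56


def block2_transistors(technique):
    """Block 2 = 2-bit RCA + second-level logic + three 2x1 multiplexers."""
    return RCA_2BIT + SECOND_LEVEL[technique] + 3 * MUX_2TO1


def adder_transistors(technique):
    """Total for the 16-bit SQRT CSLA under the stated assumption."""
    return (FIRST_BLOCK
            + block2_transistors(technique)
            + sum(BLOCKS_3_TO_5[technique]))
```

Under that assumption, the sums reproduce the block 2 totals (106 and 98) and the adder totals (792 and 736) given in the text.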

5.2 Comparative analysis of existing ALU and modified ALU
All the basic gates in the ALU, such as AND, XOR, the multiplexer and the full adder, are designed using the GDI technique. The full adder is designed using 10 transistors as well as 8 transistors. The final comparison of the 8-bit ALU is made between the ripple carry adder and the carry select adder versions.


The 8-bit ALU using the efficient carry select adder is faster than the 8-bit ALU using ripple carry adders. The ALU using the efficient carry select adder gives a 42% speed advantage for the 10-transistor adder and a 46% advantage for the 8-transistor adder. The corresponding results are shown in tables 5.3 and 5.4.

Table 5.1 Comparison of second level 2-bit RCA, 3-bit BEC and 3-bit SHM implemented using CMOS technology

Logic for second level   Number of transistors   Critical path delay (ns)   Area (µm²)   Static power (µW)   Dynamic power (µW)   Total power (µW)
RCA using CMOS           56                      1.90                       1342         6.706               42.565               49.271
BEC using CMOS           32                      1.200                      781          3.269               25.746               29.015
SHM using CMOS           24                      2.350                      486          3.100               22.843               25.943

Table 5.2 Comparison between second block with BEC and second block with SHM using CMOS

Design type   Number of transistors   Critical path delay (ns)   Area (µm²)   Static power (µW)   Dynamic power (µW)   Total power (µW)
RCA-BEC-MUX   106                     3.24                       3465         21.005              106                  127.005
RCA-SHM-MUX   98                      3.77                       2996         20.138              98.624               118.762
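The relative savings implied by Table 5.2 follow from a simple percentage change. The sketch below uses the table values as read from the source; the function and dictionary names are illustrative only.

```python
def percent_change(old, new):
    """Signed change from old to new, expressed in percent of old."""
    return (new - old) / old * 100.0


# Table 5.2 values as (RCA-BEC-MUX, RCA-SHM-MUX) pairs.
second_block = {
    "transistors": (106, 98),
    "delay_ns": (3.24, 3.77),
    "area_um2": (3465, 2996),
    "total_power_uW": (127.005, 118.762),
}

# Negative values are savings of SHM over BEC; positive values are overheads.
changes = {
    name: percent_change(bec, shm)
    for name, (bec, shm) in second_block.items()
}
```

The resulting figures agree, to within rounding, with the 7.5%, 13.5% and 6.4% reductions and the 16.3% delay overhead quoted in the conclusions.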


Table 5.3 Comparison of 8-bit ALU using 10 transistor adder

Model (ALU)                             Number of transistors   Critical path delay (ns)   Area (µm²)   Power (mW)
8-bit ALU using 10-transistor RCA       448                     3.195                      12384        0.204
8-bit ALU using 10-transistor CSLA      508                     1.865                      24682        0.205

Table 5.4 Comparison of 8-bit ALU using 8 transistor adder

Model (ALU)                            Number of transistors   Critical path delay (ns)   Area (µm²)   Power (mW)
8-bit ALU using 8-transistor RCA       432                     3.745                      11832        0.221
8-bit ALU using 8-transistor CSLA      494                     2.07                       20988        0.262

CHAPTER 6 CONCLUSIONS AND FUTURE SCOPE
6.1 Conclusions
In the process of designing a low power ALU, various tradeoffs between area, delay and power dissipation arise. As the adder is the main block in the ALU, an efficient adder is always preferred. For that reason, the SQRT carry select adder is modified to gain power and area advantages. In this process, all second-level RCA blocks of the 16-bit SQRT CSLA are replaced by SHM, and the results are compared with the existing technique, BEC.

From the comparisons in Table 5.1, it is observed that replacing the 2-bit RCA with the proposed 3-bit SHM reduces the number of transistors by 57.1% and the required area by 63.7%, with a power dissipation reduction of 47.3%. Replacing the 2-bit RCA with the existing 3-bit BEC gives only a 42.8% reduction in the number of transistors, a 41.8% reduction in area and a 41.1% reduction in power dissipation.

Finally, the second block of the 16-bit SQRT CSLA is designed using a logic-level modification, SHM in place of BEC. From Table 5.2, it is observed that the number of transistors is reduced by 7.5%, the area is reduced by 13.5% and the power is reduced by 6.4%, but the critical path delay is increased by 16.3%. This again shows the tradeoff between area, power and delay: the design is optimized for power and area at the cost of a delay overhead. This delay overhead can be overcome by using existing low power circuit-level modifications.

Using the proposed efficient carry select adder and the GDI technique, an 8-bit ALU is designed with both the 10-transistor and the 8-transistor full adder and compared with the existing 8-bit ALU using ripple carry adders in Tables 5.3 and 5.4. It is observed that the speed is increased by 41.6% in the case of the 10-transistor full adder and by 44.7% in the case of the 8-transistor full adder. The proposed design has thus been shown to outperform the existing one in speed.

A satisfactory level of power consumption and propagation delay can be achieved using the proposed technique without the need to purchase new technology libraries, which may lead to a reduction in design cost. Consequently, the proposed design is suitable for application in high-performance arithmetic and VLSI circuits in the future.

6.2 Future Scope
The proposed work can be extended by increasing the number of bits and by moving to newer technologies such as 0.08 µm and 0.06 µm. The resulting design, with a smaller number of transistors, will in turn reduce the total area and the power consumption.

REFERENCES


[1] Arun Prakash Singh, Rohit Kumar, "Implementation of 1-bit Full Adder Using Gate Diffusion Input (GDI) cell", International Journal of Electronics and Computer Science Engineering.
[2] N. M. Chore, R. N. Mandavgane, "A survey of low power high speed one bit full adder", Recent Advances in Networking, VLSI and Signal Processing, ISSN: 1790-5117, ISBN: 978-960-474-162-5.
[3] N. Weste and K. Eshraghian, Principles of CMOS VLSI Design, A System Perspective. Reading, MA: Addison-Wesley, 1993.
[4] Pardeep Kumar, International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622, Vol. 2, Issue 6, November-December 2012, pp. 599-606.
[5] M. Sreedevi and P. Jeno Paul, "Design and Optimization of a High Performance Low-Power CMOS Flex Cell", International Journal of Signal System Control and Engineering Application, 2010, vol. 3, no. 4, pp. 65-69. DOI: 10.3923/ijssceapp.2010.65.69.
[6] A good overview of leakage and reduction methods is given in the book Leakage and reduction in Nanometer CMOS Technologies, ISBN 0-387-25737-3.
[7] M. Parvathi, N. Vasantha, K. Satya Prasad, "Design of High Speed - Low Power - High Accurate (HS-LP-HA) Adder", ICECT, International Conference on Electronics Computer Technology Proceedings, 2012, pp. 523-527, 978-1-4673-1850-1/12 ©2012, IEEE.
[8] K. Allipeera, S. Ahmed Basha, "An Efficient 64-Bit Carry Select Adder With Less Delay And Reduced Area Application", International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622, www.ijera.com, Vol. 2, Issue 5, September-October 2012, pp. 550-554.
[9] O. J. Bedrij, "Carry Select Adder", IRE Trans. Electron. Comput., pp. 340-344, 1962.
[10] U. Sreenivasulu, T. Venkata Sridhar, "Implementation of An 4 Bit - ALU Using Low-Power And Area-Efficient Carry Select Adder", International Conference on Electronics and Communication Engineering, 20th May 2012, Bangalore, ISBN: 978-93-81693-29-2.
[11] A. Andamuthu, S. Rithanyaa, "Design Of 128 Bit Low Power and Area Efficient Carry Select Adder", International Journal of Advanced Research in Engineering (IJARE), Vol. 1, Issue 1, 2012, pp. 31-34.
[12] B. Ramkumar, H. M. Kittur, and P. M. Kannan, "ASIC implementation of modified faster carry save adder", Eur. J. Sci. Res., vol. 42, no. 1, pp. 53-58, 2010.
[13] T. Y. Ceiang and M. J. Hsiao, "Carry-select adder using single ripple carry adder", Electron. Lett., vol. 34, no. 22, pp. 2101-2103, Oct. 1998.
[14] Y. Kim and L. S. Kim, "64-bit carry select adder with reduced area", Electron. Lett., vol. 37, no. 10, pp. 614-615, May 2001.
[15] B. Ramkumar and Harish M. Kittur, "Low-Power And Area-Efficient Carry Select Adder", IEEE Transactions on Very Large Scale Integration (VLSI) Systems.


APPENDIX A
About Microwind2
The MICROWIND2 program allows the student to design and simulate an integrated circuit at the physical description level. The package contains a library of common logic and analog ICs to view and simulate. MICROWIND2 includes all the commands for a mask editor as well as original tools never gathered before in a single module (2D and 3D process view, Verilog compiler, tutorial on MOS devices). You can gain access to circuit simulation by pressing one single key. The electric extraction of your circuit is automatically performed and the analog simulator produces voltage and current curves immediately. This includes details on the device modeling and simulation at logic and layout levels.

Figure A: MICROWIND window as it appears at the initialization stage.

We use MICROWIND2 to draw the MOS layout and simulate its behavior. Go to the directory in which the software has been copied (by default microwind2).


Double-click on the MicroWind3 icon. The MICROWIND2 display window includes four main windows: the main menu, the layout display window, the icon menu and the layer palette. The layout window features a grid, scaled in lambda (λ) units. The lambda unit is fixed to half of the minimum available lithography of the technology. The default technology is a CMOS 6-metal-layer 0.12 µm technology; consequently lambda is 0.06 µm (60 nm).

Simulation of a layout
MICROWIND3 includes a 3D process viewer for that purpose. Click Simulate -> Process steps in 3D. The simulation of the CMOS fabrication process is performed step by step, by a click on Next Step. The picture on the left represents the nMOS device, the pMOS device, the common polysilicon gate and the contacts. The picture on the right represents the same portion of layout with the metal layers stacked on top of the active device.

The inverter simulation is conducted as follows. First, a VDD supply source (1.2 V) is fixed to the upper metal2 supply line, and a VSS supply source (0.0 V) is fixed to the lower metal2 supply line. The properties are located in the palette menu. Simply click the desired property, and then click on the desired location in the layout. Add a clock on the inverter input node (the default node name clock1 has been changed into Vin) and a visible property on the output node Vout.

The command Simulate -> Run Simulation gives access to the analog simulation. Select the simulation mode Voltage vs. Time. The analog simulation of the circuit is performed. The time domain waveform, proposed by default, details the evolution of the voltages in1 and out1 versus time. This mode is also called transient simulation.

The command Simulate -> Run Simulation gives access to four simulation modes: voltage vs. time, voltage and current vs. time, static voltage vs. voltage, and frequency vs. time. All these simulation modes are applicable to the inverter simulation. Because the layout Inv steps.Msk includes not only the correctly polarized inverter but also several other MOS devices without any simulation properties, a warning window appears prior to the analog simulation; in this case you may click Simulate as it. In normal cases, all n-well regions should be tied to VDD.

The inverter consumes power during transitions, due to two separate effects. The first is short circuit power, arising from the momentary short circuit current that flows from VDD to VSS while both transistors are partially on during switching (the incomplete on/off state). The second is charging/discharging power, which depends on the output wire capacitance. With small loading, the short circuit power loss is dominant. With huge loading, that is a large output node capacitance, the load power is dominant. The power consumption occurs briefly during transitions of the output, either from 0 to 1 or from 1 to 0. The simulation shows the supply currents in the upper window and all voltage waveforms in the lower window. The current consumption is important only during a very short period corresponding to the charge or discharge of the output node. Without any switching activity, the current is almost zero.

• Delay
As the number of gates connected to the inverter output node increases, the load capacitance increases. The fan-out corresponds to the number of gates connected to the cell output. Physically, a large fan-out means a large number of connections, that is, a large load capacitance. An inverter circuit is simulated using different clock, fan-out and supply conditions. The initial configuration is based on one inverter controlled by a 2 GHz clock, with its output connected either to a single inverter or to four inverters. The supply voltage is 1.2 V, with a 0.12 µm CMOS technology. When four inverter circuits are connected to the output node, the load capacitance increases, and in the simulation chronograms the inverter delay is significantly increased. Investigating the delay variation with the output capacitance load, the curve shows that the gate delay varies almost linearly with the loading capacitance. A 100 fF load leads to around 300 ps delay in CMOS 0.12 µm technology.

In Microwind this type of screen is obtained with the command Parametric analysis. Load the file Invcapa.MSK and invoke the command Parametric analysis. By default the capacitance of the output node is increased step by step from its default value Cdef to Cdef + 100 fF. For each value of the output capacitance, the analog simulation is performed, and the last computed rise time is plotted, appearing as one single red dot in the graph. The complete graph is built once all analog simulations have been completed. The memory button stores one curve prior to a new parametric simulation, for comparison purposes. Three main parameters may vary in the parametric analysis: the capacitance, the voltage and the temperature. Several analog parameters may be monitored: rise and fall delay, oscillating frequency, power consumption, final voltage of a node, cross talk, etc.

Power consumption
The power consumption P is computed by Microwind as the average product of the supply voltage VDD and the supply current IDD, computed at each iteration step. In other words, with N the number of iteration steps,

$$P = \frac{1}{N}\sum_{k=1}^{N} I_{DD}(k)\,V_{DD}$$
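The averaging described above can be written out directly. This is a minimal sketch of the arithmetic only, not Microwind's implementation; the function name and the sample current values used in the usage note are hypothetical.

```python
def average_power(idd_samples, vdd):
    """Average of IDD * VDD over all simulation steps.

    idd_samples: supply current at each iteration step, in amperes
    vdd:         constant supply voltage, in volts
    Returns the average power in watts.
    """
    if not idd_samples:
        raise ValueError("at least one current sample is required")
    return sum(i * vdd for i in idd_samples) / len(idd_samples)
```

For example, a supply current that alternates between 0 A and 1 mA at VDD = 1.2 V averages to roughly 0.6 mW with `average_power([0.0, 1e-3, 0.0, 1e-3], 1.2)`.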


Three main factors contribute to the power consumption P: the load capacitance C, the supply voltage VDD and the clock frequency f. For a CMOS inverter, this relation is usually represented by the first order approximation below. The equation shows a linear dependence of the power consumption P on the total capacitance C and the operating frequency f. The power consumption is also proportional to the square of the supply voltage VDD.

$$P = 0.5\,\eta\,C\,V_{DD}^{2}\,f$$

where η is the switching activity factor, C is the output load capacitance, VDD is the supply voltage and f is the clock frequency.

Frequency dependence
We can verify the linear dependence of the power consumption on the operating frequency by simulating a CMOS inverter circuit. At each time domain analog simulation, we get a value of the power consumption, which is computed by Microwind as the average product of the supply voltage VDD and the supply current IDD. As the power consumption is linearly proportional to the clock frequency, a usual metric found in most cell libraries is the µW/GHz.

Supply voltage dependence
As a first order approximation, the average power consumption is proportional to VDD^2. We use the parametric analysis tool in Microwind to control the incremental change of the supply voltage from 0.5 to 2.0 V, with a supply voltage step of 0.1 V. In the measurement window, the item dissipation is selected. The result shows a non-linear dependence of the power dissipation on VDD. The square law fits the experimental data from 0.8 to 1.5 V. We notice a very important rise of the power consumption above 1.5 V, due to avalanche effects in n-channel MOS devices. The simulation demonstrates the interest of a minimum supply voltage for achieving optimum low power operation.
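The first order approximation above is easy to evaluate numerically. The sketch below only illustrates the formula; the function name is hypothetical and the values in the usage note are illustrative choices, not measurements from the thesis.

```python
def dynamic_power(eta, c_load, vdd, freq):
    """First order estimate P = 0.5 * eta * C * VDD^2 * f, in watts.

    eta:    switching activity factor (dimensionless)
    c_load: output load capacitance, in farads
    vdd:    supply voltage, in volts
    freq:   clock frequency, in hertz
    """
    return 0.5 * eta * c_load * vdd ** 2 * freq
```

With eta = 1, a 100 fF load, VDD = 1.2 V and a 2 GHz clock, `dynamic_power(1.0, 100e-15, 1.2, 2e9)` evaluates to about 144 µW. Doubling the frequency doubles the estimate, while doubling the supply voltage quadruples it, which is the linear and square-law behaviour described in the text.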


Minimum supply voltage
We must know the supply voltage below which the inverter no longer works. The answer is given by the parametric analysis, focusing this time on the dependence of the inverter delay on the supply voltage. Load the file cmosload.msk for this study. Invoke the command Parametric analysis of the Analysis menu. Click the layout region corresponding to the node VDD. Verify that the voltage menu is selected in the parametric analysis window, and that the node VDD is selected. Modify the VDD voltage range from 0.5 to 1.5 V, with a step of 0.1 V. Finally, in the measurement menu, select the item rise delay and click Start analysis. We observe that the delay increases significantly as VDD decreases from its nominal value of 1.2 V down to 0.6 V. Below 0.7 V the inverter delay is longer than the default transient simulation time, so the delay evaluator no longer works.

Static characteristics
The static characteristics of the inverter correspond to the plot of the output voltage versus the input voltage. The simulation involves a step by step increase of Vin and the monitoring of Vout. In the simulation window, the static characteristics are obtained by a click on the item Voltage versus voltage situated in the selection menu, at the bottom of the chronograms. When Vin is low, Vout is high, which corresponds to one logic state of the inverter. When Vin increases, Vout starts to decrease slowly, and then suddenly crosses the VDD/2 boundary. At that point the value of Vin is the commutation point of the inverter, called Vc. Then, when Vin rises to VDD, Vout reaches 0, which corresponds to the other logic state of the inverter.
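Given sampled points of the static characteristic, the commutation point Vc can be estimated by locating where Vout crosses VDD/2. The following is a minimal sketch using linear interpolation between adjacent samples; the function name and the sweep values in the usage note are hypothetical and do not come from a Microwind run.

```python
def commutation_point(vin, vout, vdd):
    """Estimate Vc, the input voltage at which Vout crosses VDD / 2.

    vin, vout: equally long sequences from a static sweep with Vin increasing
    Returns None when the sweep never crosses VDD / 2.
    """
    half = vdd / 2.0
    points = list(zip(vin, vout))
    for (v0, o0), (v1, o1) in zip(points, points[1:]):
        # Falling crossing: the output is at or above VDD/2, then at or below it.
        if o0 >= half >= o1 and o0 != o1:
            return v0 + (o0 - half) * (v1 - v0) / (o0 - o1)
    return None
```

For a sweep such as `vin = [0.0, 0.4, 0.6, 0.8, 1.2]` and `vout = [1.2, 1.1, 0.6, 0.1, 0.0]` with VDD = 1.2 V, the estimate is approximately 0.6 V.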

About DSCH3


The DSCH3 program is a logic editor and simulator. DSCH3 is used to validate the architecture of the logic circuit before the microelectronics design is started. DSCH3 provides a user-friendly environment for hierarchical logic design, and fast simulation with delay analysis, which allows the design and validation of complex logic structures. Some techniques for low power design are described in the manual. DSCH3 also features the symbols, models and assembly support for the 8051. DSCH3 also includes an interface to SPICE.

Figure B: DSCH schematic editor

Features
• User-friendly environment for rapid design of logic circuits.
• Supports hierarchical logic design.
• Handles both conventional pattern-based logic simulation and intuitive on-screen mouse-driven simulation.
• Built-in extractor, which generates a SPICE netlist from the schematic diagram (compatible with PSPICE™ and WinSpice™).
• Current and power consumption analysis.
• Generates a VERILOG description of the schematic for the layout editor.
• Immediate access to symbol properties (delay, fan-out).
• Models and supports the 8051 microcontroller.

An example of the design of a schematic diagram in DSCH and the generation of its layout in MICROWIND is shown here. The CMOS inverter design is detailed in figure C below. First click New on the main menu, then draw the circuit diagram in the DSCH window by dragging the components from the symbol library. Draw the circuit diagram as shown below.

Figure C: Inverter circuit

Save the file and click Simulate → Start simulation in the main menu. Then click inside the buttons situated on the left part of the diagram. The result is displayed on the LED. Here the p-channel MOS and the n-channel MOS transistors function as switches, as shown in figure D. When the input signal is logic 0, as shown in figure 5.4, the NMOS is switched off while the PMOS passes VDD through to the output. When the input signal is logic 1, as shown in figure 6.12, the PMOS is switched off while the NMOS passes VSS to the output.
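The switch behaviour just described can be written down as a small truth-table model. This is only a switch-level sketch for illustration; the function name `cmos_inverter` and the use of 1 and 0 to stand for the VDD and VSS rails are assumptions, not part of DSCH.

```python
VDD, VSS = 1, 0  # logic levels standing in for the two supply rails


def cmos_inverter(vin):
    """Switch-level model: exactly one transistor conducts for each input."""
    pmos_on = vin == 0  # the p-channel switch closes when its gate is low
    nmos_on = vin == 1  # the n-channel switch closes when its gate is high
    if pmos_on and not nmos_on:
        return VDD  # PMOS connects the output to VDD
    if nmos_on and not pmos_on:
        return VSS  # NMOS connects the output to VSS
    raise ValueError("input must be logic 0 or 1")


for vin in (0, 1):
    print(vin, "->", cmos_inverter(vin))
```

Because the two switches are never closed together in this idealized model, there is no static path from VDD to VSS.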

Figure D: Circuit diagram of the CMOS inverter, and the CMOS inverter during simulation

The fan-out corresponds to the number of gates connected to the inverter output. Physically, a large fan-out means a large number of connections, that is, a large load capacitance. If we simulate an inverter loaded with a single output, the switching delay is small. Now, if we load the inverter with several outputs, the delay and the power consumption increase. The power consumption increases linearly with the load capacitance, mainly because of the current needed to charge and discharge that capacitance. Click the button Stop simulation shown in the figure below. You are back in the editor.
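The linear dependence of power on load capacitance noted in the fan-out discussion follows from the standard dynamic power relation P = a * C * VDD^2 * f. The sketch below assumes that relation; the per-gate input capacitance, the switching frequency and the function name are illustrative assumptions, while 1.2 V is the nominal supply quoted earlier.

```python
def dynamic_power(fan_out, c_gate=2e-15, vdd=1.2, freq=1e9, activity=1.0):
    """Estimate dynamic power in watts for an inverter driving fan_out gates.

    c_gate (farads per driven gate), freq (hertz) and activity are assumed
    values for illustration only.
    """
    load = fan_out * c_gate  # total load capacitance grows with fan-out
    return activity * load * vdd ** 2 * freq


for n in (1, 2, 4, 8):
    print(f"fan-out {n}: {dynamic_power(n) * 1e6:.2f} uW")
```

Doubling the fan-out doubles the load capacitance and therefore the estimated power, which is the linear trend the simulation shows.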


Figure E: Timing diagram of inverter

Click the chronogram icon to get access to the chronograms of the previous simulation. As seen in the waveform, the value of the output is the logical opposite of that of the input.

Generation of layout of the schematic diagram

Next, open the Microwind window and click Open in the main menu. Then open the CMOS inverter circuit diagram. Then click Compile the Verilog file to generate the Verilog file of the corresponding circuit diagram. It generates the corresponding stick diagram of the inverter circuit as shown in the figure. Then click the Simulate icon in the main menu to generate the waveforms.

Verilog program

// DSCH Ver 3.0
// G:\project\dsch microwind\self\example.sch
module example (in1, out1);
input in1;
output out1;
wire ;
pmos #(17) pmos_1(out1,vdd,in1); // 2.0u 0.12u
nmos #(17) nmos_2(out1,vss,in1); // 1.0u 0.12u
endmodule

// Simulation parameters in Verilog Format
always
#1000 in1=~in1;
// in1 CLK 1

Layout

In this paragraph, the procedure to manually create the layout of a CMOS inverter is described. Click the icon MOS generator on the palette. The following window appears. By default, the proposed length is the minimum length available in the technology (2 lambda), and the width is 10 lambda. In 0.12 µm technology, where lambda is 0.06 µm, the corresponding size is 0.12 µm for the length and 0.6 µm for the width. Click on the top of the nMOS to fix the pMOS device. The result is displayed in figure F.
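The lambda-based sizing quoted in the Layout paragraph is a simple unit conversion, sketched below. The value of lambda (0.06 µm) comes from the text; the function name `lambda_to_um` is a hypothetical helper introduced only for this illustration.

```python
LAMBDA_UM = 0.06  # lambda in micrometres for the 0.12 um technology


def lambda_to_um(n_lambda, lambda_um=LAMBDA_UM):
    """Convert a dimension expressed in lambda units to micrometres."""
    # Rounding removes floating-point noise such as 0.6000000000000001.
    return round(n_lambda * lambda_um, 6)


print("length:", lambda_to_um(2), "um")
print("width: ", lambda_to_um(10), "um")
```

Expressing dimensions in lambda keeps the layout independent of the process: only the value of lambda changes when the design is moved to another technology.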


Figure F: Layout of inverter in MICROWIND

Figure G: Selecting the NMOS device


Connection between devices

Within CMOS cells, metal and polysilicon are used as interconnects for signals. Metal is a much better conductor than polysilicon. Consequently, polysilicon is only used to interconnect gates, such as the bridge (1) between the pMOS and nMOS gates, as described in the schematic diagram of figure G. Polysilicon is rarely used for long interconnects, except where a large resistance value is wanted. In the layout shown in figure G, the polysilicon bridge links the gate of the n-channel MOS with the gate of the p-channel MOS device. The polysilicon serves as the gate control and as the bridge between MOS gates.

Figure H: Connections required to build the inverter (CmosInv.SCH)
