POWER EFFICIENT ENHANCED MULTIPLIER FOR IMAGE PROCESSING APPLICATION

R. P. Meenaakshi Sundhari<sup>1</sup>, K. Deenu<sup>2</sup>, B. Saranraj<sup>3</sup>, K. Poornimadevi<sup>4</sup>

<sup>1</sup>Professor, Department of ECE, P.A. College of Engineering and Technology, Pollachi, Coimbatore

Email: rpmeenaakshi@gmail.com

<sup>2, 3, 4</sup>Assistant Professor, Department of ECE

P.A. College of Engineering and Technology, Pollachi, Coimbatore

DOI: 10.48047/ecb/2023.12.5.254

**Abstract** - In recent digital systems, the multiplier is regarded as a suitable place to trade off energy against reliability and performance. The design of fast arithmetic blocks is fundamental to high-performance computing systems. Multiplication is a frequently used operation that appears in signal processing and other technical applications. Multipliers are hardware intensive, and the main criteria of interest in VLSI are high speed, low cost, and small area. Among multipliers, the Dadda multiplier follows a defined sequence of steps to reduce these parameters, performing multiplication in three steps built around partial product reduction. The compressor for the Dadda multiplier is designed using two new approximate 4-2 compressor designs. Image multiplication is increasingly used in real-time applications. Using this multiplier, an implementation of image multiplication is put forward with the help of Xilinx and MATLAB.

**Key words:** Dadda multiplier, image processing, MATLAB, Xilinx, VLSI.

## **I. INTRODUCTION**

Adders and multipliers play a vital role in the design of computer arithmetic circuits. A multiplier should consume little power, occupy a small area, and operate quickly. The performance of the multiplier helps determine the speed of the processor and the performance of many DSP algorithms. For these reasons, designers focus on high-speed multipliers with a low power-delay product. Delay, area, and power consumption have become the paramount concerns in the design of computing systems. Following Moore's law, the speed and density of transistors in integrated circuits have increased exponentially. It is accepted that this exponential trend will end, although it is not clear exactly how dense and fast ICs will be when that point is reached. Power has become the most important design concern, not only for the functionality of a chip but also for the density and computing power of integrated circuits. With the increased need for high-speed, low-power VLSI devices, there is a continuing demand for fast multipliers. Low power consumption is also a crucial concern in multiplier design. To cut power consumption significantly, the number of operations is reduced, which lowers dynamic power, the main component of total power usage; hence the demand for high-speed, low-energy products. Developers mostly look for high-speed, low-energy, well-defined circuits.

An image is an array of picture elements (pixels), represented as zeros and ones. Image processing demands speed and is computationally expensive. Image multiplication comes in two types. The first type takes two input images and produces an output in which each pixel value is the product of the corresponding pixel values of the inputs. The second type takes one input image and produces an output in which every pixel value is multiplied by a specified constant. Therefore, the final result is

## **II. PREVIOUS WORK**

Some authors presented an evaluation of a 32-bit array multiplier with a carry save adder and with a carry look-ahead (CLA) adder. Momeni et al. presented a design that achieves reduced power dissipation, transistor count, and delay. The latency of the reduction circuit of a Dadda multiplier depends on the number of reduction levels and the latency of each level. Ravi et al. presented a reduced-area FA that was the main element of the architecture. It uses fewer gates than the traditional architecture and has low latency and area. Radhakrishnan proposed a low-power CMOS pass-logic 4-2 compressor for high-speed multipliers.

## **III. PROPOSED WORK**

A modified design of a 4×4 multiplier is presented in the proposed work. The improved full adder, which requires 12 transistors, is used in this multiplier circuit. The full adder's design is hybrid, meaning it combines two methods: CMOS technology and pass transistor logic. An FA and an HA are used as the building blocks of the circuit because they have a low propagation delay and power-delay product. In comparison with previously built full adders, this full adder has low power and latency. As a result, multipliers with low propagation delay, low power dissipation, and low CPU time are obtained. 4:2 compressors are often used to reduce the latency of the partial-product summation stage of parallel multipliers. In the image processing application, the Dadda multiplier is used.

Section A-Research paper

#### A. DADDA MULTIPLIER

The presented 4×4 multiplier has a total of 16 partial-product terms, so the height of the tree is four, as shown in Figure 1. The Dadda technique is used to reduce the height of the tree to 2. If the Dadda method is not employed, each stage must wait for the prior stage, since the subsequent stage uses the previous stage's carry, which increases the propagation time. That simple technique requires a ripple carry adder throughout; it consumes little energy but has substantially longer latency. The Dadda technique is used to minimize the overall propagation latency.

Because the first stage does not rely on any previous stage, this is the simplest way of reducing the latency of the overall architecture. As a result, the first stage is implemented independently of the others. First, as presented in Figure 1, the partial products are arranged to form a tree structure. AND gates generate these partial products, and the height of this tree is 4 because the presented device is a 4×4 multiplier.
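As a rough illustration of the partial-product array described above, the following Python sketch forms the 16 AND terms of a 4×4 multiplication and groups them by column weight. The helper name `partial_product_columns` and the list-of-columns layout are illustrative assumptions and are not taken from the paper's HDL code.

```python
def partial_product_columns(a, b, n=4):
    """Group the n*n AND-gate partial products of a*b by bit weight.

    columns[k] holds every term a_i & b_j with i + j == k (weight 2**k).
    Hypothetical helper, for illustration only.
    """
    columns = [[] for _ in range(2 * n - 1)]
    for i in range(n):
        for j in range(n):
            a_i = (a >> i) & 1
            b_j = (b >> j) & 1
            columns[i + j].append(a_i & b_j)  # one AND gate per term
    return columns


cols = partial_product_columns(0b1011, 0b1101)
heights = [len(c) for c in cols]
print(heights)       # column heights from LSB to MSB
print(max(heights))  # height of the tree before any reduction
# Adding the weighted terms recovers the ordinary product.
print(sum(bit << k for k, col in enumerate(cols) for bit in col))
```

The tallest column is the middle one, which is why the unreduced tree of a 4×4 multiplier has a height of four.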

#### **B. LEVELS OF DADDA TECHNIQUES**

The goal is to reduce the height of the tree from 4 to 2. After the first Dadda level is completed, the building blocks have reduced the tree height from 4 to 3, and after the second Dadda level, from 3 to 2. Two Dadda stages are therefore employed. Figures 2 and 3 depict the first and second Dadda stages, respectively.



Figure 1: 4×4 multiplication

Figure 2: Primary Dadda level

Figure 3: Secondary Dadda level

Figure 4: Final Dadda level

The height-reduction levels of the 4×4 tree are as follows:

- Primary level: height of 4
- Secondary level: height of 3
- Final level: height of 2
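The sequence of heights listed above matches the usual Dadda height rule, in which the target heights start at 2 and each next target is the floor of 1.5 times the previous one. The sketch below, with the hypothetical helper name `dadda_heights`, lists the targets for a tree of a given starting height; it is an illustration of that general rule rather than code from the paper.

```python
def dadda_heights(max_height):
    """Return the Dadda target heights below max_height, largest first.

    The sequence starts at d = 2 and grows as d -> floor(1.5 * d).
    """
    targets = []
    d = 2
    while d < max_height:
        targets.append(d)
        d = (3 * d) // 2
    return targets[::-1]


print(dadda_heights(4))  # [3, 2]: reduce 4 -> 3, then 3 -> 2
print(dadda_heights(8))  # [6, 4, 3, 2] for a larger tree of height 8
```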

#### **C. ALGORITHM IMPLEMENTATION**

The Dadda method is applied to this tree to reduce its height. In the first level, the tree has a height of four, which must be lowered using FAs and HAs. To do so, an FA is used in the fourth column, an HA in the second column, and two additional FAs in the 3<sup>rd</sup> and 5<sup>th</sup> columns, which brings the tree height down to three. These HAs and FAs operate in parallel, with none relying on a prior step. Figures 3 and 4 depict the further tree reduction, with a ripple carry adder used in the last level.

## **IV. BLOCK DIAGRAM**



The working stages of the proposed model are as follows:

1. AND gates are used to generate the 16 partial products in the first step.
2. Next, the height of the tree is reduced using 1 HA and 3 FAs in the first Dadda stage.
3. Next, two HAs and two FAs are used to reduce the height further.
4. A ripple carry adder is used in the final stage of the Dadda tree.
5. Finally, the output voltage levels are improved by passing the results through a buffer, which smooths them.
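The working stages can be checked with a small bit-level model. The sketch below is a minimal Python illustration, assuming the textbook Dadda column rule (an HA when a column is one bit too tall, an FA otherwise); the function names are hypothetical, the output buffer is an electrical detail that is not modelled, and the adder placement may differ from the one drawn in the paper's figures.

```python
def half_adder(x, y):
    """Return (sum, carry) of two one-bit inputs."""
    return x ^ y, x & y


def full_adder(x, y, z):
    """Return (sum, carry) of three one-bit inputs."""
    return x ^ y ^ z, (x & y) | (y & z) | (x & z)


def dadda_multiply_4x4(a, b):
    """Bit-level model of a 4x4 Dadda multiplier (illustrative sketch)."""
    n = 4
    # Step 1: AND gates generate the partial products, grouped by weight.
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))

    # Steps 2-3: reduce the tree height 4 -> 3 -> 2 with HAs and FAs.
    adders_per_stage = []
    for target in (3, 2):
        used = {"HA": 0, "FA": 0}
        for k in range(2 * n - 1):
            while len(cols[k]) > target:
                if len(cols[k]) == target + 1:
                    s, c = half_adder(cols[k].pop(), cols[k].pop())
                    used["HA"] += 1
                else:
                    s, c = full_adder(cols[k].pop(), cols[k].pop(), cols[k].pop())
                    used["FA"] += 1
                cols[k].append(s)      # sum stays in the same column
                cols[k + 1].append(c)  # carry moves to the next column
        adders_per_stage.append(used)

    # Step 4: ripple carry adder over the two remaining rows.
    product, carry = 0, 0
    for k in range(2 * n):
        x = cols[k][0] if len(cols[k]) > 0 else 0
        y = cols[k][1] if len(cols[k]) > 1 else 0
        s, carry = full_adder(x, y, carry)
        product |= s << k
    return product, adders_per_stage


# Exhaustive check against ordinary integer multiplication.
print(all(dadda_multiply_4x4(a, b)[0] == a * b
          for a in range(16) for b in range(16)))
print(dadda_multiply_4x4(13, 11))
```

Under this textbook rule the first stage needs two HAs and the second stage one HA and three FAs, so the per-stage adder counts are not identical to those listed in the steps; the product is the same either way because every HA and FA preserves the weighted sum of the bits.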

## **V. 4:2 COMPRESSOR**

The circuit diagram of a 4:2 compressor can be seen below.



Figure 6: Block representation of 4:2 compressor

The 4:2 and 5:2 compressors are widely used to shorten the latency of the partial-product summation stage of parallel multipliers. Most compressor designs are tuned to optimize one or more design characteristics. This research focuses on approximate 4:2 compressors. First, a brief background on the exact 4:2 compressor is given. The internal structure of an exact 4:2 compressor consists of two serially connected FAs. In this structure, all inputs and the Sum output have the same weight, while the Carry and Cout outputs are weighted one bit position higher.
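The two-FA structure and the output weights can be illustrated with a short Python sketch. The function names are hypothetical; the model assumes the usual wiring in which the first FA adds x1, x2, and x3, and the second FA adds its sum to x4 and Cin.

```python
from itertools import product


def full_adder(x, y, z):
    """Return (sum, carry) of three one-bit inputs."""
    return x ^ y ^ z, (x & y) | (y & z) | (x & z)


def compressor_4_2(x1, x2, x3, x4, cin):
    """Exact 4:2 compressor modelled as two cascaded full adders."""
    s1, cout = full_adder(x1, x2, x3)       # first FA; cout feeds the next column
    total, carry = full_adder(s1, x4, cin)  # second FA
    return total, carry, cout


# The five inputs and Sum have weight 1; Carry and Cout have weight 2.
weights_ok = all(
    sum(bits) == s + 2 * (c + co)
    for bits in product((0, 1), repeat=5)
    for s, c, co in [compressor_4_2(*bits)]
)
print(weights_ok)
```

Because Cout depends only on x1, x2, and x3, it does not wait for Cin, which is what keeps a row of compressors from forming a long carry chain.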



Figure 7: Simulation result of compressor

The outputs Sum, Carry, and Cout are derived from

$$Sum = x_1 \oplus x_2 \oplus x_3 \oplus x_4 \oplus C_{in}$$

$$Carry = (x_1 \oplus x_2 \oplus x_3 \oplus x_4)\,C_{in} + \overline{(x_1 \oplus x_2 \oplus x_3 \oplus x_4)}\,x_4$$

$$Cout = (x_1 \oplus x_2)\,x_3 + \overline{(x_1 \oplus x_2)}\,x_1$$

| Cin                                   | X4 | X3 | X2 | X1 | Cout | Carry | Sum |
|---------------------------------------|----|----|----|----|------|-------|-----|
| 0                                     | 0  | 0  | 0  | 0  | 0    | 0     | 0   |
| 0                                     | 0  | 0  | 0  | 1  | 0    | 0     | 1   |
| 0                                     | 0  | 0  | 1  | 0  | 0    | 0     | 1   |
| 0                                     | 0  | 0  | 1  | 1  | 1    | 0     | 0   |
| 0                                     | 0  | 1  | 0  | 0  | 0    | 0     | 1   |
| 0                                     | 0  | 1  | 0  | 1  | 1    | 0     | 0   |
| 0                                     | 0  | 1  | 1  | 0  | 1    | 0     | 0   |
| 0                                     | 0  | 1  | 1  | 1  | 1    | 0     | 1   |
| 0                                     | 1  | 0  | 0  | 0  | 0    | 0     | 1   |
| 0                                     | 1  | 0  | 0  | 1  | 0    | 1     | 0   |
| 0                                     | 1  | 0  | 1  | 0  | 0    | 1     | 0   |
| 0                                     | 1  | 0  | 1  | 1  | 1    | 0     | 1   |
| 0                                     | 1  | 1  | 0  | 0  | 0    | 1     | 0   |
| 0                                     | 1  | 1  | 0  | 1  | 1    | 0     | 1   |
| 0                                     | 1  | 1  | 1  | 0  | 1    | 0     | 1   |
| 0                                     | 1  | 1  | 1  | 1  | 1    | 1     | 0   |
| 1                                     | 0  | 0  | 0  | 0  | 0    | 0     | 1   |
| 1                                     | 0  | 0  | 0  | 1  | 0    | 1     | 0   |
| 1                                     | 0  | 0  | 1  | 0  | 0    | 1     | 0   |
| 1                                     | 0  | 0  | 1  | 1  | 1    | 0     | 1   |
| 1                                     | 0  | 1  | 0  | 0  | 0    | 1     | 0   |
| 1                                     | 0  | 1  | 0  | 1  | 1    | 0     | 1   |
| 1                                     | 0  | 1  | 1  | 0  | 1    | 0     | 1   |
| 1                                     | 0  | 1  | 1  | 1  | 1    | 1     | 0   |
| 1                                     | 1  | 0  | 0  | 0  | 0    | 1     | 0   |
| 1                                     | 1  | 0  | 0  | 1  | 0    | 1     | 1   |
| 1                                     | 1  | 0  | 1  | 0  | 0    | 1     | 1   |
| 1                                     | 1  | 0  | 1  | 1  | 1    | 1     | 0   |
| 1                                     | 1  | 1  | 0  | 0  | 0    | 1     | 1   |
| 1                                     | 1  | 1  | 0  | 1  | 1    | 1     | 0   |
| 1                                     | 1  | 1  | 1  | 0  | 1    | 1     | 0   |
| 1                                     | 1  | 1  | 1  | 1  | 1    | 1     | 1   |

Table 1: Truth table of 4:2 compressor
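The rows of Table 1 can be regenerated from the Boolean expressions for Sum, Carry, and Cout. The Python sketch below assumes the standard multiplexer form of the exact 4:2 compressor, in which the second term of Carry and of Cout uses the complement of the XOR; the function name `compressor_equations` is hypothetical.

```python
from itertools import product


def compressor_equations(x1, x2, x3, x4, cin):
    """Evaluate Sum, Carry, and Cout from the Boolean expressions."""
    p = x1 ^ x2 ^ x3 ^ x4               # XOR of the four inputs
    total = p ^ cin
    carry = (p & cin) | ((1 - p) & x4)  # selects cin when p = 1, else x4
    q = x1 ^ x2
    cout = (q & x3) | ((1 - q) & x1)    # selects x3 when q = 1, else x1
    return total, carry, cout


# Rows in the column order of Table 1: Cin X4 X3 X2 X1 | Cout Carry Sum
for cin, x4, x3, x2, x1 in product((0, 1), repeat=5):
    total, carry, cout = compressor_equations(x1, x2, x3, x4, cin)
    print(cin, x4, x3, x2, x1, "|", cout, carry, total)
```

Every row satisfies x1 + x2 + x3 + x4 + Cin = Sum + 2(Carry + Cout), which is a quick way to check any entry of the table by hand.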

## **VI. EXPERIMENTAL OUTPUTS**

#### **A. EXPERIMENTAL STRUCTURE**

The proposed structure is assessed in this section in terms of critical path latency, CPU time, and power consumption. A conventional Dadda multiplier and an array multiplier were compared to evaluate how well the approximate multiplier saves power, shortens the critical path latency, and reduces CPU time.

Both the array multiplier and the Dadda multiplier were 4×4 bits and were coded in a hardware description language. The power consumption, delay, and CPU time were obtained using Xilinx software.



Figure 8: Simulation result of Dadda multiplier

#### **B. POWER CONSUMPTION, CPU TIME AND DELAY**

Comparisons of power consumption, CPU time, and delay for the array multiplier and the Dadda multiplier are shown in Figures 9 to 11, respectively. From Figure 9, the power consumption of the Dadda multiplier is reduced by 2.47 µW compared to the array multiplier. Figure 10 shows the CPU time, which the proposed method reduces by 18.02 s compared to the array multiplier. Figure 11 compares the delay of the array multiplier and the Dadda multiplier; the Dadda multiplier is reported to reduce the latency by 8.255 ns compared to the array multiplier.











Table 2: Comparison of array multiplier and Dadda multiplier

| Parameter              | Existing method | Proposed method |
|------------------------|-----------------|-----------------|
| Power consumption (µW) | 7.53            | 5.06            |
| CPU time (s)           | 19.10           | 1.08            |
| Delay (ns)             | 20.829          | 12.637          |
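The savings implied by Table 2 can be worked out directly from its entries. The short Python sketch below does this arithmetic; the dictionary and the helper name `saving` are illustrative, and the values are copied from the table, not measured again.

```python
# Values copied from Table 2: (existing method, proposed method).
table2 = {
    "Power consumption (uW)": (7.53, 5.06),
    "CPU time (s)": (19.10, 1.08),
    "Delay (ns)": (20.829, 12.637),
}


def saving(existing, proposed):
    """Return (absolute reduction, percentage reduction)."""
    diff = existing - proposed
    return diff, 100.0 * diff / existing


for name, (existing, proposed) in table2.items():
    diff, pct = saving(existing, proposed)
    print(f"{name}: reduced by {diff:.3f} ({pct:.1f}%)")
```

From the table entries alone, the differences are 2.47 µW in power, 18.02 s in CPU time, and 8.192 ns in delay.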

## **VII. IMAGE PROCESSING APPLICATION**

First, the input image is read into MATLAB. The image is converted from RGB to grayscale and then resized. The pixel values are extracted from the converted image, which means that unneeded information is removed; the result is called the resized image. The resized image is then digitized, that is, converted to text (byte) values.
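The pre-processing flow (grayscale conversion, resizing, and conversion to text values) can be sketched in plain Python as follows. This is only an illustration and not the paper's MATLAB script: the luminance weights 0.299, 0.587, and 0.114, the nearest-neighbour resize, the 8-bit binary text format, and all function names are assumptions.

```python
def rgb_to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to 8-bit gray values."""
    return [
        [min(255, round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
        for row in rgb_image
    ]


def resize_nearest(gray, new_h, new_w):
    """Nearest-neighbour resize of a 2-D list of gray values."""
    old_h, old_w = len(gray), len(gray[0])
    return [
        [gray[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def to_text(gray):
    """Flatten the image to one 8-bit binary string per pixel."""
    return [format(p, "08b") for row in gray for p in row]


# Tiny 2x2 test image: red, green, blue, white.
image = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
gray = rgb_to_gray(image)
small = resize_nearest(gray, 1, 1)
print(gray)
print(to_text(small))
```

A text file with one binary word per line is a common way to pass pixel data to an HDL test bench, which is why the sketch ends with strings and not numbers.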

Figure 12 shows the block diagram of the Dadda multiplier for the image processing application.

Figure 12: Block diagram of image processing application
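Image multiplication itself, as outlined in the introduction, either multiplies two images pixel by pixel or multiplies one image by a constant. The Python sketch below shows both forms on 8-bit pixel values. The function names and the clipping of results to 255 are assumptions made for illustration; the paper does not state how overflow is handled or how pixels are mapped onto the 4×4 multiplier.

```python
def multiply_images(img_a, img_b):
    """Pixel-by-pixel product of two equally sized 8-bit images."""
    return [
        [min(255, pa * pb) for pa, pb in zip(row_a, row_b)]  # clip to 8 bits (assumed)
        for row_a, row_b in zip(img_a, img_b)
    ]


def scale_image(img, constant):
    """Multiply every pixel of one image by a constant."""
    return [[min(255, round(p * constant)) for p in row] for row in img]


a = [[2, 3], [20, 16]]
b = [[4, 5], [20, 16]]
print(multiply_images(a, b))                     # large products are clipped to 255
print(scale_image([[10, 100], [200, 0]], 1.5))   # brightening by a factor of 1.5
```

In a hardware flow, each product inside `multiply_images` is the operation that the multiplier circuit would carry out.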

## **VIII. CONCLUSION**

In the proposed methodology, a 4×4-bit approximate multiplier is designed using 4-2 compressors and approximate HAs and FAs. Simulation results have been obtained and reported. These approximate compressors are designed using FAs, and it is concluded that the Dadda multiplier design using full adders achieves a significant reduction in parameters such as power consumption, delay, and CPU time. Image multiplication is of growing importance in real-time applications. Using this Dadda multiplier, an application to image multiplication is proposed with the help of Xilinx, ModelSim, and MATLAB.

## **IX. REFERENCES**

1. Zain Shabbir, Anas Razzaq Ghumman, Shabbir Majeed Chaudhry, "A Reduced-sp-D3Lsum Adder-Based High Frequency $4 \times 4$ Bit Multiplier Using Dadda Algorithm," Springer Science+Business Media New York, 2015.

2. "Design of high speed carry save adder using carry lookahead adder." Available from: https://www.researchgate.net/publication/301407573\_Design\_of\_high\_speed\_carry\_save\_adder\_using\_carry\_lookahead\_addr [accessed Sep 22, 2017].

3. S. Z. Naqvi, S. Z. Hassan and T. Kamal, "A power consumption and area improved design of IIR decimation filters via MDT," 2016 International Conference on Intelligent Systems Engineering (ICISE), Islamabad, 2016, pp. 146-151. doi: 10.1109/INTELSE.2016.7475111

4. A. Mukhtar, H. Jamal and U. Farooq, "An area efficient interpolation filter for digital audio applications," IEEE Transactions on Consumer Electronics, vol. 55, no. 2, pp. 768-772, May 2009. doi: 10.1109/TCE.2009.5174452

5. Stephen P. Boyd, Seung-Jean Kim, Dinesh D. Patil, Mark A. Horowitz, "Digital Circuit Optimization via Geometric Programming," Operations Research, vol. 53, no. 6, pp. 899-932, November-December 2005. doi: 10.1287/opre.1050.0254

6. P. Prem Kumar, K. Duraiswamy, and A. Jose Anand, "An optimized device sizing of analog circuits using genetic algorithm," European Journal of Scientific Research, vol. 69, no. 3, pp. 441-448, 2012.

7. Revna Acar Vural, Burcu Erkmen, Ufuk Bozkurt, Tulay Yildirim, "CMOS Differential Amplifier Area Optimization with Evolutionary Algorithms," Proceedings of the World Congress on Engineering and Computer Science 2013, Vol II, WCECS 2013, 23-25 October 2013, San Francisco, USA.

8. Jorge Juan Chico, Enrico Macii, chapter 17, "Power-Consumption Reduction in Asynchronous Circuits Using Delay Path Unequalization," pp. 151-160, 13th International Workshop, PATMOS 2003, Turin, Italy, September 10-12, 2003, Proceedings.

9. Raminder Preet Pal Singh, Praveen Kumar, Balwinder Singh, "Performance Analysis of 32-bit Array Multiplier with Carry Save Adder and with Carry Look Ahead Adder," International Journal of Recent Trends in Engineering, Vol. 2, No. 6, November 2009.