

# A HIGH-SPEED LOW-COMPLEXITY MODIFIED RADIX – 2<sup>5</sup> FFT PROCESSOR FOR HIGH RATE UWB APPLICATIONS

NIZAMUDHIN R<sup>1</sup>, V.SARADA<sup>2</sup>

<sup>1</sup>PG Student, <sup>2</sup>Assistant Professor, <sup>1,2</sup>VLSI Design Dept. of Electronics and Communication Engineering, SRM University, Kattankulathur, India.

Abstract : This paper presents a high-speed lowcomplexity modified Radix --2^5 128-point fast Fourier transform (FFT) processor using an eight data-path pipelined approach for high rate wireless personal area network applications. A novel modified radix -- 2^5 FFT algorithm that reduces the hardware complexity is proposed. This method can reduce the number of complex multiplications and the size of the twiddle factor memory. It also uses a complex constant multiplier instead of a complex Booth multiplier. The proposed FFT processor achieves a signal-toquantization noise ratio of 35 dB at 12 bit internal word length. The proposed processor has been designed and implemented using 90-nm CMOS technology with a supply voltage of 1.2 V. The results demonstrate that the total gate count of the proposed FFT processor is 290 K. Furthermore, the highest throughput rate is up to 2.5 GS/s at 310 MHz while requiring much less hardware complexity.

#### I. INTRODUCTION

Due to the advanced VLSI technology, Fast Fourier Transform (FFT) has been applied to wide field of Digital Signal Processing (DSP) applications. For modern communication systems, the most important FFT application is the Orthogonal Frequency Division Multiplexing (OFDM). OFDM is an advanced modulation technique that effectively expands channel utilization and reduces Inter-Symbol Interference (ISI) and Inter-Carrier Interference (ICI) caused by multi-path effect. It is mainly used in Digital Audio Broad-casting (DAB), DVB-T and Digital Video Broadcasting-Handheld. This can be used for services such as HDTV, offering increased capacity for program broadcasting.

Recently, with the increased cell-based IC design resources in chip technology, the computation of complex algorithm, such as the FFT can be implemented on a chip. A single chip can operate for different length FFT to meet the requirement of various OFDM communication standards. As a result of complex control, the variable-length FFT (VL-FFT) processor should have little extra hardware cost compared with the fixed length FFT processor.

Third generation systems are an extension on the complexity of second-generation systems and are expected to be introduced after the year 2000.

The system capacity is expected to be increased to over ten times original first generation systems.

The Telecommunications industry faces the problem of providing telephone services to rural areas, where the customer base is small, but the cost of installing a wired phone network is very high. One method of reducing the high infrastructure cost of a wired system is to use a fixed wireless radio network. The problem with this is that for rural and urban areas, large cell sizes are required to get sufficient coverage.

The OFDM technology was first conceived in the 1960s and 1970s during research into minimizing ISI, due to multipath. The expression digital communications in its basic form is the mapping of digital information into a waveform called a carrier signal, which is a transmitted electromagnetic pulse or wave at a steady base frequency of alternation on which information can be imposed by increasing signal strength, varying the base frequency, varying the wave phase, or other means. Multiplexing is the process of sending multiple signals or streams of information on a carrier at the same time in the form of a single, complex signal and then recovering the separate signals at the receiving end.

## II. RELATED WORK

A novel 128-point FFT/IFFT processor for ultra wideband (UWB) systems. The proposed pipelined FFT architecture, called Mixed-Radix Multipath Delay Feedback (MRMDF), can provide a higher throughput rate by using the multi data-path scheme. Furthermore, the hardware costs of memory and complex multipliers in MRMDF are high in the known FFT processor by means of the delay feedback and the data scheduling approaches. In this brief, a high-throughput and lowcomplexity Fast Fourier Transform (FFT) processor for wideband orthogonal frequency division multiplexing communication systems is presented. A new Indexed-Scaling method is proposed to reduce both the critical-path delay and hardware cost by employing shorter word length. Together with the Mixed-Radix multipath delay feedback structure, the proposed FFT processor can achieve very high throughput with low hardware cost. The signal-toquantization-noise ratio (SQNR) performance of this test chip is over 32.7 dB to support 16-QAM modulation.

In this paper, a novel modified radix--2<sup>5</sup> FFT algorithm and a 128-point FFT/IFFT processor architecture, which can provide a high throughput of 2.5 GS/s and SQNR of 35 dB for16-QAM applications are proposed. For achieving a high data throughput, reduced hardware complexity and higher SQNR performance are described. A novel modified radix-2<sup>5</sup> FFT algorithm that reduces the hardware complexity is proposed. This method can reduce the number of complex multiplications and the size of the twiddle factor memory. It also uses a complex constant multiplier instead of a complex Booth multiplier.

## **III. FFT ALGORITHM**

## A. Introduction

Discrete Fourier Transform (DFT) is widely applied to the analysis, design, and the implementation of discrete-time signal processing algorithms and computations. The computational complexity of direct evaluation is rather high so that it is hard to meet high performance and low cost design goal. Therefore, a fast DFT algorithm is required. Since the early paper proposed by Cooley and Turkey in 1965, they provided a lot of ways to reduce the computational complexity from  $O(N^2)$  to O(Nlog<sub>2</sub>N). After that, the algorithms that can reduce the computational complexity are generally called Fast Fourier Transform (FFT) algorithms. The Fast Fourier Transform (FFT) algorithm, together with its many successful applications, represents one of the most important advancements in scientific and engineering computing in this century. The wide usage of computers has been instrumental in driving the study of the FFT, and a very large number of articles have been written about the algorithm over the past thirty years.

FFT algorithms are based on the fundamental principle of decomposing the computation of the Discrete Fourier Transform of a sequence of length N into successively smaller Discrete Fourier Transforms. The regularity of the algorithms makes them suitable for VLSI implementation. FFT algorithms are recursive in structure to solve a given problem, they call themselves recursively one or more times to deal with closely related sub-problem. They typically follow a divide-and-conquer approach. Three major steps of the divide-and-conquer paradigm are:

- 1. Divide the problem into two or more subproblems of smaller size.
- 2. Solve each sub-problem recursively by the same algorithm. Apply the boundary condition to terminate the recursion when the sizes of the sub-problems are small enough.
- 3. Obtain the solution for the original problem by combining the solutions to the sub-problems.

Within this framework, there are two commonly used FFT variants that differ in the way the two half size sub-problems are defined. They are referred to as the DIT (Decimation In Time) and the DIF (Decimation In Frequency) FFT. There is no difference in computational complexity and Signal Flow Graph (SFG) between two types of algorithms, here in only focus on DIF FFT algorithms. Since the low computational complexity of FFT algorithms is desired for high speed and low power consideration in VLSI implementation, the computational complexity of different algorithms.

# IV. THE FFT PROCESSOR ARCHITECTURE

# A. Pipelined Architectures

Pipelined architectures are useful for FFTs that require high data throughput. The basic principle with pipelined architectures is to collapse the rows, instead of the stages like in column architectures. The architecture is built up from radix butterfly elements with commutators in between. An unscrambling stage is sometimes added on the input or output side of the processor, if the output data is needed to be in natural order. The advantage with these architectures, are for instance, high data throughput, relatively small area and a relatively simple control unit.

The basic structure of the pipelined architecture is shown below. Between each stage of radix-r PEs there is a commutator (denoted C in the picture). The last stage is the unscrambling stage (denoted U in the picture). The commutator reorders the output data from the previous stage and feeds to



International Journal of Innovative Trends and Emerging Technologies

the following stage. The unscrambler rearranges the data in natural sorted order.

## B. Pipeline-Based FFT Architecture

The pipeline-based architectures are the most popular FFT processor architectures because they are designed by emphasizing speed performance and the regularity of data path. The best way to obtain the pipeline-based FFT architectures is through vertical projection of signal flow graph (SFG). In view of Figure 1, take an example to explain a projection mapping for 8-point radix-2 DIF FFT.



Fig.1 General Structure of a Pipelined FFT Architecture

The structure of each stage obtained from the projection mapping is called the processing element (PE). A processing element contains a buttery unit (BF) for addition and subtraction between two input data of each stage, a complex multiplier and a block of buffer to store and reorder data for the buttery unit of next stage.

There are two types of data buffering structures for pipeline-based FFT processor. One is Delay-Commutator (DC), and the other is Delay-Feedback (DF). According to the structure difference, pipeline-based FFT architecture can be divided into three types:

#### V. THE FIXED LENGTH FFT PROCESSOR

#### ARCHITECTURE

#### A. Introduction

With the Television and the broadcast are digitized DAB, DVB-T and DVB-H will play very important roles in the future. They all adopt the OFDM system to solve the problems of ISI. FFT processor is the key to an OFDM system, because it is one of the area consuming and power-hungry blocks in an OFDM system. In this chapter, design and implement a fixed-length FFT (FL-FFT) processor that operates on variable data length of 2, 4, 8, 16, 32, 64 etc, points is designed.

From the analysis high-radix algorithm has lower computational complexity. But the high-radix algorithm can't be applied to all sizes of FFT for the OFDM applications. It must insert a radix-2 stage or two radix-2 stages to deal with the 2n-point or 4npoint FFT. Such limitation does not exist in splitradix FFT algorithm, but compared with fixed-radix FFT algorithm it is more difficult to implement in hardware due to the irregularity of its SFG. By these results, adopt the radix-2<sup>5</sup> FFT algorithm for our FL-FFT design. The FFT radix-2<sup>5</sup> algorithm is the proposed FFT processor when compared to radix-2<sup>4</sup> algorithm. From the analysis in Chapter 4, that there are three kinds of popular pipeline architectures used to implement FFT algorithms. By comparing them, find that SDF has less hardware requirement and higher utilization rate, but longer FFT computation time. Adopt the SDF architecture in proposed pipeline FFT processor.

#### **B.FFT Butterfly**

Butterfly unit will perform complex addition and subtraction. And it will perform twiddle factor -j multiplication with the help of multiplexers and control signals. MUX based bypass circuits are used to control FIFO input and butterfly computation input. FIFO length will be changed based on number of FFT points used and also based one FFT stage.

C. Shift Register

Dual-port Register file is used to implement the FIFO. The FIFO functions as shift register. Here we use FIFO as a delay element. All input values are saved into the FIFO until the N/2 input reached with current input. Then, the butterfly units conduct calculations between the current input values and FIFO outputs. From the butterfly unit outputs, the complex addition outputs will be forwarded into next stage. And the complex subtraction outputs will be entering into the FIFO.

D. Complex Multiplier

There are three types of multiplications in the architecture. One is the multiplication by -j. The second is the multiplication by a constant twiddle factor. The third is the multiplication by a complex twiddle factor.

E. Constant Twiddle factor Multiplication





Paper ID # IC15004



# V. CONCLUSION

The proposed work is to design a lowpower and area-efficient fixed-length FFT processor, which can be used for 32 point FFT. It's characterized of achieving low-power consumption, small area cost and high flexibility by adopting radix- $2^5$  algorithm.

The processing units consist of Radix-2<sup>5</sup> butterfly module to provide flexibility is implemented and simulated in Modelsim. The use of Radix-2<sup>5</sup> is reduces the number of complex multiplication by using the twiddle factor properties. FFT architecture which supports 32 point FFT will be implemented.

To achieve an proper reconfigurable algorithm for 64/128 point FFT processor using Radix-2<sup>5</sup> implementation. And to reduce the complex multiplication by using reconfigurable property with less power consumption and high speed.

## REFERENCES

- Qingwang Lu, Xin'an Wang and JiuChong Niu, "A Lowpower Variable-length FFT Processor Base on Radix-2<sup>4</sup> Algorithm", IEEE 2009, Key Lab of Integrated Microsystem Science & Engineering Applications Peking University Shenzhen Graduate School ShenZhen, China.
- Taesang Cho; Hanbo Lee Very Large Scale Integration (VLSI) Systems, IEEETransaction\_on Volume: 21, Issue: 1 Publication Year: 2013, Page(s): 187 - 191
- J. Lee and H. Lee, "A high-speed two-parallel radix-2<sup>4</sup> FFT/IFFT processor for MB-OFDM UWB systems," *IEICE Trans. Fundam.*, vol E91-A, no. 4, pp. 1206–1211, Apr. 2008.
- Y. Chen, Y. Tsao, Y. Wei, C. Lin, and C. Lee, "An indexedscaling pipelined FFT processor for OFDM-based WPAN applications," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 55, no. 2, pp. 146–150, Feb. 2008.
- Jung-yeol Oh and Myoung-seob Lim, "Fast Fourier transform processor based on low-power and area-efficient algorithm", Proc. IEEE Asia-pacific Conference on Advanced System Integrated Circuits, pp. 198-201, Aug 2004.
- T. Cho, H. Lee, J. Park, and C. Park, "A high-speed lowcomplexity modified radix-2<sup>5</sup> FFT processor for gigabit WPAN applications," in *Proc.IEEE Int. Symp. Circuits Syst. (ISCAS)*, 2011, pp. 1259–1262.
- K. Chalennsuk, R.H. Spaanenburg, L. Spaanenburg, M. Seutter and H.Stoorvogel, "Flexible-length fast fourier transform for COFDM", 15th IEEE International Conference on Electronics, Circuits and Systems, pp. 534-537, Aug 2008.
- Y.-I. Lin, P.-Y. Tsai and T.-D. Chiueh, "Low-power variablelength fast Fourier transform Processor," IEE Proc. Computers and DigitalTechniques, vol 152, pp 499-506, July 2005.
- K. Cho, K. Lee, J. Chung, and K. Parhi, "Design of low-error fixedwidth modified booth multiplier," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 12, no. 5, pp. 522–531, May 2004