Efficient Design of FFT Module Using Dual Edge Triggered Flip Flop and Clock Gating.

R. Preyadharan, PG scholar, Knowledge Institute of Technology, Salem, Tamil Nadu
A. Tamilselvan, Assistant Professor, Knowledge Institute of Technology, Salem, Tamil Nadu.

Abstract—This paper proposes an approach to designing a Fast Fourier Transform (FFT) module that combines architecture-level and system-level techniques. Orthogonal Frequency Division Multiplexing (OFDM) is used in many communication systems, such as high data rate mobile wireless communication. In the OFDM architecture, the IFFT at the transmitter converts the data from the frequency domain to the time domain, and the FFT at the receiver converts it from the time domain back to the frequency domain. Both transforms have the same structure; the only difference is that the twiddle factors of one are the complex conjugates of those of the other. The FFT is an efficient algorithm for computing the Discrete Fourier Transform (DFT) and its inverse. In the proposed system, the FFT used in OFDM is realized at the architectural level with dual edge triggered flip flops (DETFF) instead of the traditional single edge triggered flip flops, and at the system level with a gated clock on the input stage to reduce power consumption. Processing speed is an important factor for the blocks of an OFDM system. A DETFF captures and propagates data at both clock edges, so it is suitable for high data rate applications and can increase the speed of the FFT module. Much of the power consumed in the design is dissipated by the clock, and clock gating is used to reduce this clock power. The proposed structure therefore modifies the design at both the architectural level and the system level.

Index terms— Fast Fourier Transform, OFDM, Twiddle Factors, Dual Edge Triggered Flip flop, Clock Gating.

I. INTRODUCTION

As the use of communication equipment increases, so does the demand for high data rates. A higher data rate causes more distortion in a multipath channel, which has led to the emergence of multicarrier modulation techniques. Multicarrier modulation divides the high-rate data stream into several low-rate streams to reduce the effects of distortion. However, conventional multicarrier modulation suffers from low bandwidth efficiency, because of the spacing between adjacent channels, and from interchannel interference (ICI). OFDM uses orthogonal subcarriers, which provide high bandwidth efficiency and mitigate this interference, addressing both problems of the conventional multicarrier technique. OFDM is therefore a solution for high data rate communication.

OFDM is an encoding process that codes digital data on multiple carrier frequencies. It is used for both wired and wireless digital communication. Application areas include digital television, audio broadcasting, DSL internet access including asymmetric digital subscriber line (ADSL), wireless networks including wireless local area networks (WLAN), power line networks, mobile communications, and multimedia communication services. OFDM is based on the principle of orthogonality, in which each subcarrier is orthogonal to the others.

This means that crosstalk between the subchannels is eliminated and guard bands between them are not required. The essential component of OFDM is the Fast Fourier Transform (FFT). Many communication systems require an FFT wherever low power and high speed are needed. The FFT algorithm computes the DFT and the IDFT efficiently. Several architectures are possible for the DFT, but the butterfly-based architecture (referred to as the FFT) is the most widely adopted because of its shorter computation time. The butterfly diagram is shown in figure 1.

Sequential logic blocks such as flip flops are used to store data. A flip flop can be either a single edge triggered flip flop (SETFF) or a dual edge triggered flip flop (DETFF). An SETFF captures data on either the positive edge or the negative edge of the clock, whereas a DETFF captures data on both edges. A DETFF therefore increases the data transmission rate.

Optical wireless communication (OWC) systems employ orthogonal frequency division multiplexing. The two existing methods are asymmetrically clipped optical OFDM and direct current biased optical OFDM, and their bit-error ratio performance has been compared for different clipping levels and multilevel quadrature amplitude modulation schemes [1]. In OFDM, data bits are encoded onto multiple subcarriers. Its use is growing dramatically in wireless and wired communication systems, and it makes maximum use of the available bandwidth [2]. Mobile WiMAX uses OFDMA technology, and the radix-2 algorithm is used for the OFDM communication system [3]. Folding transformation and register minimization techniques are used to design FFT architectures, and parallel-pipelined real-valued and complex-valued Fourier transform architectures are used to reduce the operating frequency and power consumption. A Fast Fourier Transform (FFT) algorithm based on Decimation-In-Time (DIT), called the radix-4 DIT-FFT algorithm, has been developed [5]. The FFT algorithm is used in linear filtering, correlation, and spectrum analysis. It can be performed in two ways, Decimation-In-Time (DIT) and Decimation-In-Frequency (DIF), and the speed of both relies mainly on the multiplier [4].

A Double Edge Triggered D-Flip Flop (DETFF) is suitable for low power and high performance applications; it has fewer clocked transistors, lower average power, and less delay than existing designs [7]. The flip flop is an important component in digital circuits. Incorporating a Dual Edge Triggered Flip Flop reduces power dissipation when switching activity is low, and the latency of the flip flop can be minimized by using a fast schematic latch [8].

Fig.1 Basic butterfly diagram of Radix-2 FFT
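As a minimal sketch of the butterfly in Fig. 1, the Python snippet below computes one radix-2 decimation-in-time butterfly. The function names `butterfly` and `twiddle` and the sample values are illustrative assumptions; they are not taken from the Verilog code of this work.

```python
import cmath

def twiddle(k, n):
    """Twiddle factor W_n^k = exp(-2j*pi*k/n) of the forward transform."""
    return cmath.exp(-2j * cmath.pi * k / n)

def butterfly(a, b, w):
    """One radix-2 DIT butterfly: returns (a + w*b, a - w*b)."""
    t = w * b
    return a + t, a - t

# A 2-point FFT is a single butterfly with twiddle factor W_2^0 = 1.
x0, x1 = 3 + 0j, 1 + 0j          # hypothetical input samples
X0, X1 = butterfly(x0, x1, twiddle(0, 2))
print(X0, X1)
```

The same butterfly serves the inverse transform if the conjugate twiddle factor is passed in, which is the only structural difference between the FFT and the IFFT noted in the abstract.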

II. DUAL EDGE TRIGGERED FLIP FLOP

An explicit-pulsed dual-edge triggered sense-amplifier flip-flop has led to an improved common mode rejection ratio [9]. Redundant transitions are eliminated by using the conditional discharge technique [10]. A design normally consists of a combination of combinational and sequential circuits. The sequential circuits are flip flops and latches, and flip flops can be used for all memory operations. The speed of the design depends on the operating speed of these sequential circuits. The dual edge triggered flip flop is a newer design option for the sequential circuits.

Dual edge triggered flip flops are suitable for low-power designs because they allow the clock frequency to be halved. That is, the main advantage of a DETFF is that it maintains the same throughput at half the clock frequency. A single edge triggered flip flop can be implemented with two latches in series; a dual edge triggered flip flop can be implemented with two latches in parallel. The clock signal is assumed to be inverted locally. No pre-defined template exists for a dual edge triggered flip flop, so it is built from two D flip flops and a multiplexer: one flip flop is clocked on the positive edge and the other on the negative edge. This arrangement serves as the template for the dual edge triggered flip flop. Increasing the speed of the sequential circuits increases the processing speed of the design, so using DETFFs allows a fast, low-power device to be implemented.

Fig.2 Block diagram of DETFF
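The two-flip-flop-plus-multiplexer arrangement can be sketched behaviorally in Python. This is only an illustrative model under stated assumptions: the class names, the sampled-waveform style of simulation, and the choice of the clock itself as the multiplexer select are assumptions, not a transcription of the Verilog design.

```python
class DFlipFlop:
    """Single edge D flip flop; rising=True samples on 0->1, else on 1->0."""

    def __init__(self, rising):
        self.rising = rising
        self.q = 0
        self._prev_clk = 0

    def tick(self, clk, d):
        trigger = (0, 1) if self.rising else (1, 0)
        if (self._prev_clk, clk) == trigger:
            self.q = d              # capture the input on the active edge
        self._prev_clk = clk
        return self.q


class DualEdgeFlipFlop:
    """DETFF model: positive-edge FF, negative-edge FF and a 2:1 multiplexer."""

    def __init__(self):
        self.pos = DFlipFlop(rising=True)
        self.neg = DFlipFlop(rising=False)

    def tick(self, clk, d):
        qp = self.pos.tick(clk, d)
        qn = self.neg.tick(clk, d)
        # Assumed mux select: clock high picks the positive-edge FF,
        # clock low picks the negative-edge FF.
        return qp if clk else qn


# One new data bit per clock half-period (hypothetical stimulus).
clk_wave = [1, 0, 1, 0, 1, 0]
data_wave = [1, 0, 1, 1, 0, 1]
detff = DualEdgeFlipFlop()
q_wave = [detff.tick(c, d) for c, d in zip(clk_wave, data_wave)]
print(q_wave)
```

In this model the output follows the data at every clock transition, which is the behavior the multiplexer-based template is meant to provide.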

In a synchronous system, operations on the input data sequence produce the output sequence with a predetermined timing relationship. This timing relationship is controlled by flip-flops and latches driven by a global clock, as shown in figure 2. These clocked storage elements store values according to their inputs. They are classified as flip-flops or latches according to their behavior during the clock phases: a latch is level sensitive and a flip-flop is edge sensitive. A latch is transparent and propagates its input to its output during one clock phase (either high or low), and holds its value during the other phase. A flip-flop captures its input and propagates it to the output at a clock edge (rising or falling) and keeps the output constant at all other times. The design of a storage element in a sequential circuit depends mainly on the clocking scheme and the circuit topology. Because this paper targets synchronous systems with edge-triggered clocking, only flip-flops are discussed. Specifically, dual edge triggered flip-flops are used to improve the data rate. Using both clock edges to capture data also improves latency and power efficiency.

2.1 Dual Edge Triggered Flip Flop Clock Pulse

In a digital system, synchronous circuits operate under the control of a common clock. The clock is applied to the circuit at regular intervals, and the circuit updates its state and output at each clock event. The clock signal thus controls the operation of the system. When a set of gates and flip-flops is interconnected in a synchronous system, the clock signal makes all flip-flops sample and store their input data at the same time. The clock is the major power consumer in the design. Using the clock efficiently reduces power and time, which increases the operating speed and reduces delay. Clock skew is a major problem in designs with a large number of inputs; buffers are used to overcome it. A DETFF captures data at both the positive and the negative edge of the clock, so the design can operate on both edges. A DETFF can therefore capture in a single clock cycle the two data values that an SETFF captures in two clock cycles, so the execution time is ideally half that of the SETFF.
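The capture-rate argument above can be checked by counting active edges in a sampled clock waveform. The snippet is a small illustration only; `capture_count` and the four-cycle waveform are hypothetical names and values chosen for the example.

```python
def capture_count(clk_wave, edges):
    """Count capture events in a sampled clock waveform.

    `edges` is the set of (previous, current) clock pairs that trigger a capture.
    """
    count = 0
    prev = clk_wave[0]
    for clk in clk_wave[1:]:
        if (prev, clk) in edges:
            count += 1
        prev = clk
    return count

RISING, FALLING = (0, 1), (1, 0)

# Four full clock cycles, starting and ending low: 0,1,0,1,0,1,0,1,0
clk_wave = [0, 1] * 4 + [0]

setff_captures = capture_count(clk_wave, {RISING})           # positive edge only
detff_captures = capture_count(clk_wave, {RISING, FALLING})  # both edges
print(setff_captures, detff_captures)  # 4 8
```

Over the same four clock cycles the dual edge element captures twice as many values, which is equivalent to saying that it sustains the same throughput at half the clock frequency.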

2.2 Dual Edge Triggered Flip Flop Characteristics

The proposed characterization relates to flip-flops used in high-performance data-path applications. In a typical pipeline stage, the logic processes data supplied by the triggering flip-flops and delivers the results to the capturing flip-flops. This logic path environment dictates the system performance. The characteristics of the DETFF and the SETFF are compared below. The clock is shown first, followed by the data input. The SETFF captures data only at the positive edge of the clock and is idle at the negative edge, whereas the DETFF captures data at both the rising and the falling edge.

A. Comparison Of Dual Edge Triggered Flip Flop And Single Edge Triggered Flip Flop

<table>
<thead>
<tr>
<th>S.NO</th>
<th>Specification</th>
<th>SETFF</th>
<th>DETFF</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>Number of flip flops used</td>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>2</td>
<td>No of registers</td>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>3</td>
<td>Number of input and output buffers</td>
<td>2</td>
<td>2</td>
</tr>
<tr>
<td>4</td>
<td>Time for execution</td>
<td>6.216 ns</td>
<td>6.394 ns</td>
</tr>
</tbody>
</table>

Since the importance of low-power, high-performance timing elements was recognized, many latches and flip-flops have been designed. The Hybrid-Latch Flip-Flop (HLFF) is one of the fastest flip-flops; it has a negative setup time, which gives it a soft clock edge property, but it consumes a large amount of power because of redundant transitions at its internal nodes. The Conditional-Capture Flip-Flop (CCFF) is another high-performance flip-flop, which eliminates redundant internal transitions to reduce power dissipation.

III. PROPOSED ARCHITECTURE

In this work, the FFT block is developed using the Dual Edge Triggered Flip Flop (DETFF). A conventional flip flop captures data on only one edge of the clock (positive or negative), whereas a DETFF captures data on both edges. Using DETFFs in the pipeline architecture therefore reduces the execution time. The twiddle factors are stored in a ROM so that the execution speed is increased.
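As an illustration of what such a twiddle-factor ROM might hold for an 8-point transform, the sketch below precomputes the four factors W_8^0 to W_8^3 and quantizes them. The fixed-point format with 8 fractional bits and the helper names `twiddle_rom` and `to_fixed` are assumptions made for the example; the number format of the actual design is not specified here.

```python
import cmath

def twiddle_rom(n):
    """Return the n/2 twiddle factors W_n^k = exp(-2j*pi*k/n), k = 0..n/2-1."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

def to_fixed(value, frac_bits=8):
    """Quantize a real number to a signed fixed-point integer (assumed format)."""
    return round(value * (1 << frac_bits))

rom = twiddle_rom(8)
for k, w in enumerate(rom):
    # Each ROM word holds the quantized real and imaginary parts.
    print(k, to_fixed(w.real), to_fixed(w.imag))
```

Reading the factors from a table avoids evaluating sine and cosine at run time, which is the reason for storing them in a ROM.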

3.1 Simulation Results

Simulation Result For DETFF

![Fig.5 Simulation output of DETFF](image)

The dual edge triggered flip flop was coded in Verilog and verified using the ModelSim simulator. The input is applied through the variable 'd' and the output appears on the variable 'q'. The simulation result is shown in figure 5, where data capture at both the positive and the negative clock edge can be observed. With d = 0 at a positive clock edge, the output is q = 0; when the input then changes from 0 to 1, the output changes to 1 at the following negative clock edge. The simulation result thus shows data capture at both the positive and the negative edge.

3.2 Input Sequence Of Design

The binary input values are applied in ModelSim with the force command, after selecting the input objects of the design. Figure 6 shows the input sequence of the design.

3.3 Simulation Result for 2-Point FFT

The 2-point FFT is the first stage in the design of the FFT butterfly structure. The simulation result is shown in figure 7.

In figure 7, the inputs are a1, b1, c1, d1, ax, bx, cx, dx and the outputs are rebase, reexp, imgbase, imgexp, rebase1, reexp1, imgbase1, imgexp1. Eight 2-point radix-2 FFT structures are used.

3.4 Simulation result For 4-Point Radix 2 FFT

The second stage of the 8-point radix-2 FFT design is the 4-point FFT. The output of the first stage is the input of this stage. Some of the 2-point outputs are multiplied by twiddle factors. The multiplied outputs and the remaining first-stage outputs are forwarded to the second stage, the 4-point butterfly structure. The simulation output is shown below.

3.5 Simulation Output Of The 8-Point FFT

This is the third and final stage of the design. The output of the second stage is forwarded to the input of this stage. The output of the final stage is shown in figure 9.
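The three-stage flow (2-point, 4-point, then 8-point butterflies) can be sketched in software as an iterative radix-2 decimation-in-time FFT. This is a reference model under stated assumptions, not the hardware implementation: the bit-reversed input ordering, the function names, and the test vector are illustrative choices.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of index i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft8_staged(x):
    """8-point radix-2 DIT FFT computed as three butterfly stages."""
    n = 8
    # Decimation in time consumes the input in bit-reversed order.
    data = [complex(x[bit_reverse(i, 3)]) for i in range(n)]
    size = 2
    while size <= n:                      # size = 2, 4, 8: the three stages
        half = size // 2
        for start in range(0, n, size):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / size)   # twiddle W_size^k
                a = data[start + k]
                t = w * data[start + k + half]
                data[start + k] = a + t
                data[start + k + half] = a - t
        size *= 2
    return data

result = fft8_staged([1, 2, 3, 4, 5, 6, 7, 8])   # hypothetical test vector
print([round(abs(v), 3) for v in result])
```

Each pass of the outer loop corresponds to one stage: the first stage needs only the trivial twiddle factor 1, while the later stages multiply some intermediate outputs by non-trivial twiddle factors before the butterfly.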

A. 8-Point Radix 2 FFT

All input values are set to 1, so the output of the design is 8 0 0 0 0 0 0 0. This result was verified using a MATLAB program.
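The expected output for the all-ones input also follows directly from the DFT definition, as the short derivation below shows. The notation (x[n], X[k], W_8) is the standard DFT notation and is introduced here only for this check.

```latex
X[k] = \sum_{n=0}^{7} x[n]\, W_8^{kn}, \qquad W_8 = e^{-j 2\pi / 8}

x[n] = 1 \;\Rightarrow\; X[0] = \sum_{n=0}^{7} 1 = 8

X[k] = \sum_{n=0}^{7} W_8^{kn} = \frac{1 - W_8^{8k}}{1 - W_8^{k}} = 0, \qquad k = 1, \dots, 7
```

The last line uses the geometric series together with the facts that W_8^{8k} = 1 for every integer k and that W_8^{k} differs from 1 for k = 1, ..., 7.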

B. Synthesis Process

The Xilinx tool is used for synthesis. In this final step, the Verilog code is converted to RTL, which the Xilinx tool synthesizes; the tool also generates the schematic diagram of the design. Figure 11 is the schematic of the design. The schematic consists of four 2-point butterfly structures, two 4-point butterfly structures, and one 8-point radix-2 FFT block, together with 4 twiddle factor modules and 8 complex multiplier blocks.

The synthesis report is as follows:

Selected Device: 3s100evq100-5

Number of Slices: 331 out of 960 34%
Number of Slice Flip Flops: 444 out of 1920 23%
Number of 4 input LUTs: 527 out of 1920 27%
Number of bonded IOBs: 373 out of 66 565% (*)
Number of GCLKs: 1 out of 24 4%

Timing Summary

Speed Grade: -5
Minimum period: 13.282ns (Maximum Frequency: 75.289MHz)
Minimum input arrival time before clock: 10.651ns
Maximum output required time after clock: 5.467ns
Maximum combinational path delay: 9.477ns.
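The maximum clock frequency follows from the minimum period as its reciprocal. The short check below reproduces that arithmetic; the helper name `max_frequency_mhz` is hypothetical.

```python
def max_frequency_mhz(min_period_ns):
    """Maximum clock frequency in MHz for a given minimum period in ns."""
    return 1e3 / min_period_ns

f_max = max_frequency_mhz(13.282)   # minimum period from the timing summary
print(round(f_max, 3))
```

A period of 13.282 ns gives about 75.29 MHz, which agrees with the maximum frequency of 75.289 MHz quoted in the conclusion.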

CONCLUSION

The FFT block is crucial in wireless applications. Any design can normally be realized as combinational logic plus sequential elements (most commonly single edge triggered flip flops). In this design, the FFT was successfully realized using dual edge triggered flip flops instead of traditional single edge triggered flip flops, which allows data to be captured on both the rising and the falling edge of the clock. The RTL code was synthesized for the Xilinx Spartan-3 family and achieved a maximum frequency of 75.289 MHz. Because no pre-defined template for a dual edge triggered flip flop is available in the FPGA, dual edge triggering was achieved using two flip flops and a 2:1 multiplexer. Although this increases area and timing compared with a traditional single edge triggered flip flop based FFT, it is a reasonable tradeoff because data can be captured at the otherwise idle edge of the clock. If the same design is implemented on a recent ASIC or FPGA family in which a DETFF is available, better area and timing optimization can be achieved.

REFERENCES


[12] Shingo Yoshizawa, Hirokazu Ikeuchi, and Yoshikazu Miyanaga, Graduate School of Information Science and Technology, Hokkaido University, Japan, “Scalable Pipeline Architecture of MMSE MIMO Detector for 4x4 MIMO-OFDM Receiver”.


