"Behind every stack of books there is a flood of knowledge."
The Fourier transform has been widely used in the field of digital signal processing. This method is also utilized to demodulation of OFDM receiver side. In wireless communication systems, we must change the number of the fixed data point which use the Fourier transform, and we must also change it to obey the standards. We would like to change from the fixed data point to the variable data point. As a reason, we are troublesome that we change the fixed data point by every standard. The purpose of this text is we design FFT circuit.
MATLAB 7.9.0 (R2009b) XILINX®System Generator for DSP 12.1
DFT (Discrete Fourier Transform) is the fourier transform on the discrete time domain, and this method is often used for the frequency analysis of a discrete digital signal in the digital signal processing. Now, we perform the frequency analysis of discrete time domain signal x[n] ( x[n] is complex values).
We obtain the N-sample signal in frequency domain (complex values) X, … , X[N-1] which use DFT to the N-sample signal in time domain x, … , x[N-1].
Using the twiddle factor , the discrete Fourier coefficients are expressed as equation (1.2)
The case of N=4 is shown using the matrix representation in equation (1.3).
Calculating equation (1.3) results are as following equations (1.4) ～ (1.7).
According to equations (1.4) ~ (1.7), the N=4 DFT requires 16 times complex multipliers, and 12 times complex adders. Similarly, the N-point DFT requires N2 times complex multipliers, and N(N-1) times complex adders. The most of processing time which calculate the DFT using computer is the calculation time of the complex multipliers. If the number of complex multipliers is significant reduced, the processing speed of the DFT is expected to increase significantly.
FFT (Fast Fourier Transform) is an algorithm, which speeds up the calculation of the discrete Fourier transform. It was proposed by John James Cooley in 1965.
The algorithm is described as follows
For example, in case of N=4, the matrix representation of the discrete Fourier coefficients is shown as equation (1.3).
By even-odd sorting the index of the input column X[k], the following eq. (2.1) is obtained
Here, we consider about the property of the twiddle factor. The twiddle factor has the periodic property as be shown in equation (2.2)
The proof of equation (2.2) is as follows.
The left-side of equation (2.2) is rewritten as following equation:
The N power of the twiddle factor is :
According to the Euler’s identity , we have
Calculating in the same way, the N/2 power of the twiddle factor
Equation (2.1) is represented by using the periodic property of the twiddle factor as follow:
This matrix decomposes (2.4) because coefficient matrix (2.3) has the regularity.
Decomposed coefficient matrix (2.4) can change as follows.
O is 2×2 zero matrix I is 2×2 identity matrix
We calculate of red part in (2.5).
Proceed with further calculations become to (2.7).
Calculated result (2.9),(2.10),(2.11),(2.12) correspond to discrete fourier transform result (1.4),(1.5),(1.6),(1.7) use of periodicity of the twiddle factors.
DFT of N =4 needs 16 times complex multiplier and 12 times complex adder. However, in the case of (2.8) can do 4 times complex multiplier and 4 times complex adder. This algorithm is called fast fourier transform. It is specially lower complex multiplier than DFT.
3. Radix-2 algorithm
We talk about Radix-2 algorithm and butterfly computation. It shows fourier transform result of X(k) in last chapter.
Their formulas are coordinated of even or odd at sigma, as (3.1)
One FFT can divide two FFT at N even and N odd.
NIf N is power of 2, it can divide FFT and more. Finally, it divide several FFT(butterfly computation) of N=2. This is Radix-2 algorithm. Its algorithm usually reduce of calculation, reason of using complex adder-subtractor only.
Fig.1 shows butterfly computation circuit using of butterfly computation.
Fig.2 shows calculation flow about discrete fourier coefficient of N =4. Note,
under the arrow of -1 is subtraction, twiddle factor at the top of line is multiplication.
Calculated results used of are changed (3.3) ,(3.4), (3.5), (3.6). Their results equal DFT results.
Basic concept of 4 points FFT circuit which applies Radix-2 architecture is shown in figure 3.
In figure 3，4 point FFT circuit can implement by using 4 butterfly processing elements.
Block diagram is shown in figure 4.
In figure 3, butterfly processing element that is shown red line make be corresponding to red block in figure 4. Another butterfly processing element is same.
This is the 4 points FFT circuit for parallel input.
The detail of each subsystem is shown below
In this part, we explain the Radix2_0, Radix2_1, Radix2_2 and Radix2_3 This circuit has 2 complex number inputs and 2 complex outputs. Therefore, real number and complex number must be separate input, and calculate each part. Architecture is shown figure 4.
Subsystem radix-2_1, radix-2_2, radix-2_3 is same as radix2_0.
AddSub1 in figure 6 is same as AddSub.
AddSub3 is also same as AddSub2.
This section shows the MULT to the MULT3 subsystem. The MULT subsystem multiplies a twiddle factor. The Complex number operation mutiple is showed by
As following this process, multiple twiddle factors in terms of complex input. figure 9 shows the circuit block of the MULT subsystem． Abbreviate explanations of the MULT1 subsystem to the MULT3 subsystem, because of the same architecture as the MULT subsystem against the constant block, the W_real and the W_imag of figure 9, parameters.
Abbreviate an explanation of the AddSub block, because of the same as the AddSub2 block of figure 6.
Also, abbreviate an explanation of the AddSub1 block, because of the same as the AddSub1 block of figure 6.
Figure 10 shows parameters of the Mult block.
Abbreviate explanations of the Mult1 block, the Mult2 block and the Mult3 block, because of the same parameters as the Mult block of figure 9.
Table 1 shows the Constant block parameters showing the twiddle factors depending on or in the MULT subsystem, the MULT1 subsystem, the MULT2 subsystem and the MULT3 subsystem.
Table 1: Parameter of the twiddle factor
The subsystem MULT multiplies the complex twiddle factor in terms of the complex input signals.
Input x to x of Table 2 into the FFT circuit of Fig.5, then got X to X of Table 2.
figure 11 shows the result of hardware simulation is consistent with the result of MATLAB simulation.
Table 2: Verification Result and Error on MATLAB
Then, figure 12 and figure 13 shows the waveform of input signals and output signals.
Depending on this figure, in case of parallel input, FFT has no latency.
As in the case with N=4, design block diagram from the algorithm of 8-point FFT.
Following the algorithm, 12 of the Butterfly computation circuit construct the N=8 FFT circuit.
Figure 15 shows block diagram.
The designed circuit is shown in figure 16.
We omit the description of subsystems radix2_1-radix-2_12, because they have the same circuit construction as the subsystem radix-2_0 in figure 6.
We also omit the description of subsystems MULT0-MULT11, because they have the same circuit construction as the subsystem MULT in Figure 7, except for the parameters of constant block.
The parameters of constant block in MULT0-MULT11 are shown in Table 3, where the twiddle factors are 、 、 、
Table 3: Parameters of twiddle factors
On MULT3 and MULT9 in Table3, we set 0.7071 as a round number for
The bit width of constant blocks of the subsystem MULT is 16-bit, and the fractional bit is 14-bit. can be expressed as 0.70709228515625 by the fractional bit.
If we design circuits with floating-point arithmetic, the scale of circuits becomes very large. Therefore, fixed-point arithmetic is generally used for calculations. However, in this case, error occurs due to constraints on bit width.
In figure 9, we simulate the behavior of the circuit. The input signals are x(0)=0+0j, x(1)=1+1j, x(2)=2+2j, x(3)=3+3j, x(4)=4+4j, x(5)=5+5j, x(6)=6+6j, x(7)=7+7j , then the output signals become X(0)=28+28j, X(1)=-13.64+5.678j, X(2)=-8+0j, X(3)=-5.635-2.365j, X(4)=-4-4j, X(5)=-2.3650-5.678j, X(6)=0-8j, X(7)=5.635-13.64j, and Input to Output latency of this circuit is 0.
The difference between the result of circuit simulation and MATLAB simulation is shown in Table The error range is about , and the cause of this error is rounding error which occurs by expressing in binary at constant block of MULT. In order to do more accurate calculation, we need larger bit width for fractional bit of constant block in the subsystem MULT.
Table 4: The difference between the result of circuit simulation and MATLAB simulation
In chapter A-1, we designed 4 point FFT circuit for parallel input.
The FFT circuit for parallel input can be designed easily, and it has an advantage that Input to Output latency is 0. However the circuit scale becomes very large.
If we designed FFT circuit with less butterfly circuits by reusing them with delay blocks and multiplexer (MUX) blocks, we could make the circuit scale smaller.
In this chapter, we design 4 point FFT circuit for serial input.
The conceptual block diagram is as follows.
The details of MUX_control and W_control blocks in Figure 19 is described as follows.
MUX1 and MUX2 output the values of d0 ~ d2 when sel=0~2. MUX_control decides the output signal of MUX1 and MUX2 multiplexers from d0 ~ d2
The output signal of MUX_control is variable at each step. Therefore, MUX_control can select the signal which is input to next butterfly circuit.
W_control decides the twiddle factor W.
This architecture computes butterfly operation changing the value of twiddle factor W (real part, imaginary part) at each step.
Each block parameters are shown in Table 5.
Table 5: W_control block paramters
From this table, MULT1 block operates multiply by at first step and at next step. MULT2 block always operates multiply by
The timing chart of 4-points FFT circuit for serial input signal is shown in Fig. 22.
simulation result of this architecture using x(0)=0+0j, x (1)=1+1j, x(2)=2+2j, x(3)=3+3j as input signal is shown in Fig. 23.
From this figure, the output signal of this architecture is X(0)=6+6i, X(1)=-4+0i, X(2)=-2-2i, X(3)=0-4i which is same as the MATLAB simulation result (Table 6).
Table 6: The comparison between circuit and MATLAB
And latency of this architecture is 3 clocks.
First, the way of 8-point FFT circuit design is explained as follow.
This architecture inputs serial input signal to butterfly operation circuit. The output signals of butterfly operation circuit are selected to compute next butterfly operation by multiplexer.
This circuit is shown in figure 27.
The MUX_control block is a sequence that controls the signal input to the selector of the multiplexer. This signals that determine the output of the multiplexer.
In figure 28, we shows the parameter of sequence.
In figure 28, we confirmed that MUX_control value output from each step has been changed. Only the signals required for the FFT butterfly computation circuit of output signal is controlled sent to the next butterfly computation circuit.
W_control is a sequence represented the rotation factor W
By varying the value of the complex twiddle factors are multiplied per STEP, butterfly computation is performed.
The complex twiddle factor is , , , In Table 7, we shows the parameter of sequence.
Table 7: Parameters of the W_control sequence
In Table 7, we used the is as approximate number 0.7071.
In Tabel 7, we confirmed multiply the first step in , in MULT0 of the figure 26, since then, in turn multiplied by , , , The first step in MULT1 multiplies the the next step is to multiply the The MULT2 are always multiplied by the
In figure 27, we show the 8-point FFT circuit timing chart of figure 29 for the serial input.
Because it is hard to recognize the value of under decimal point from output wave shown by figure 26, we show those values on MATLAB. x(0)=0+0j, x(1)=1+1j, x(2)=2+2j, x(3)=3+3j, x(4)=4+4j, x(5)=5+5j, x(6)=6+6j, x(7)=7+7j Output signals are
In Figure 30, as will be noted from the Table 8, first step on MULT0 is multiplying by X(0)=28+28j, X(1)=-13.6570+5.6575j, X(2)=-8+0j, X(3)=-5.6570-2.3430j, X(4)=-4-4j, X(5)=-2.3430-5.6575j, X(6)=0-8j, X(7)=5.6570-13.6570j
The deference between above values and the results calculated on MATLAB (Figure 27) is shown by table 8.
Maximum error is, A reason of this error is Round-off Error caused by representing with binary .
Table 8: Error compared to calculated results on MATLAB
A serial input signal is delayed and carried into Butterfly arithmetic circuit. Then, system separates output signals of Butterfly processing circuit into required signals for calculating FFT and no required signals by using Multiplexer.
After that, required signals are carried into next Butterfly arithmetic circuit. A circuit which we have made is shown by figure 34.
MUX Control block controls input signal carried into selector of multiplexer. Output signal of Multiplexer is decided by this control.
The sequence parameter is shown by figure 35.
In figure 35, output signal from MUX_control varies on each step. The only signals required for FFT are sent to next Butterfly arithmetic circuit by the signal varying.
W_control is sequence representing twiddle factor W.
Butterfly arithmetic is done by varying W complex multiplied on each step.
Each W are…, , ,, , , , Sequence parameters are shown by table 9.
Table 9: W_control sequence parameter
In Table 9, is expressed as 0.7071, is expressed as 0.9239 and is expressed as 0.3827.
In Figure 30, as will be noted from the Table 9, first step on MULT0 is multiplying by and the next steps are multiplying by , , , , , , in order. First step on MULT1 is multiplying by and the next steps are multiplying by in order. First step on MULT2 is multiplying by and the next step are multiplying by All steps on MULT3 is multiplying by
Timing diagram of 16-points FFT circuit for serial input is shown in Figure 36.
A simulation result for input signals that x(0)=0+0j, x(1)=1+1j, x(2)=2+2j, x(3)=3+3j, x(4)=4+4j, x(5)=5+5j, x(6)=6+6j, x(7)=7+7j, x(8)=8+8j, x(9)=9+9j, x(10)=10+10j, x(11)=11+11j, x(12)=12+12j, x(13)=13+13j, x(14)=14+14j, x(15)=15+15j is shown in Figure 37.
Calculation results on MATLAB and the errors are shown in Table 10.
Figure 10: Output waveform of 16-points FFT circuit for serial input
By Figure 37, the Latency between an input and an output for this circuit is 15.
The verification of 16-points FFT circuit is making m-file for test, and calculation of mean square error (MSE) between result on MATLAB and result for circuit by random inputs a hundred times. The result is shown in Table 11.
Table 11: Calculation result of MSE
By the Verification, this 16-points FFT circuit perform that MSE of real part is 、 and image part is
By the same expansion token, serial inputs are delayed and inputted into butterfly circuit, and only necessary outputs for FFT are inputted into next butterfly circuit. It is possible for the expansion by the control on Mux. In this way, Radix-2 -points FFT circuit can be designed. Furthermore, also Radix-4 -points FFT circuit can be designed.
In this level, designner must assume input signal coming in serially with HEAD signal as shown in figure 39. Design FFT circuit which generate FFT outputs and OUTHEAD signal with some latency. The latency is arbitrary.
Table 5: Pin list
|Signal name||in or out||bit width||explanation|
|HEAD||IN||１||‘1’ means the beginning|
|FFTIN_I||IN||8||Input Real components (Unsinged Int,0-255)|
|FFTIN_Q||IN||8||Input Imaginary components (Unsinged Int,0-255)|
|OUTHEAD||OUT||1||‘1’ means the beginning|
|FFTOUT_I||OUT||14||Output Real components, Keep 8-bit precision|
|FFTOUT_Q||OUT||14||Output Imaginary components, Keep 8-bit precision|
This should be implemented by only one circuit.
Although it is getting very difficult to compare many designs, please use the following design parameter normalization.
The UNIT AREA is defined as 50 inputs EXOR circuit.
The UNIT SPEED is the latency of 50 inputs EXOR circuit.
Please show your design’s area and speed by this normalized UNITS.
50 inputs exor VHDL：parity.vhd
8bit Input (Unsigned Fixed point: Arbitary Decimal Point)
Virtual Fashion Education
"chúng tôi chỉ là tôi tớ của anh em, vì Đức Kitô" (2Cr 4,5b)
News About Tech, Money and Innovation
Modern art using the GPU
Find the perfect theme for your blog.
Learn to Learn
Con tằm đến thác vẫn còn vương tơ
Khoa Vật lý, Đại học Sư phạm Tp.HCM - ĐT :(08)-38352020 - 109
Blog Toán Cao Cấp (M4Ps)
Indulge- Travel, Adventure, & New Experiences
"Behind every stack of books there is a flood of knowledge."
The latest news on WordPress.com and the WordPress community.