A New Relation between “Twiddle Factors” in the Fast Fourier Transformation

Abstract —The fast Fourier transformation (FFT) algorithm is probably the most important algorithm in digital signal processing. It is an efficient algorithm for computing the discrete Fourier transformation, which determines the frequency components of a discrete time-varying signal. Nowadays, it has a huge impact on modern society because the FFT runs on billions of devices (e


I. INTRODUCTION
The discrete Fourier transform (DFT) is among the most fundamental methods in digital signal processing (DSP). However, its wide use was restricted by its computational needs. In 1965 Cooley and Tukey presented an efficient algorithm for the DFT which reduces the number of operations from $N^2$ to $N \log_2 N$, where N is a power of two ($N = 2^n$) [1]. It was an important milestone in DSP research. Thereafter, many articles presented refinements to the original algorithm such as decimation in frequency (DIF), higher radices, specialised fast Fourier transforms (FFT) for real data, etc. A clear and detailed review of the FFT evolution can be found in [2].
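To give a feel for the size of this saving, here is a worked example for N = 1024 (an illustrative size chosen here, not one taken from the tests in this paper):

$$N^2 = 1024^2 = 1\,048\,576, \qquad N \log_2 N = 1024 \cdot 10 = 10\,240,$$

so the operation count drops by a factor of roughly 100.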
Nowadays, several methods exist for computing the DFT efficiently. Generally, the radix-4 and split-radix algorithms are used when the sample size is a power of two, whereas the prime factor algorithm is popular for sizes having co-prime factors [3]-[6].
In many cases, the very-large-scale integration (VLSI) design largely determines the usefulness of a given FFT algorithm. This means that the efficiency of an FFT algorithm depends on the applied processor architecture [7], [8]. In addition, most FFT algorithms parallelise well, so in multiprocessor systems parallelisation accelerates the transformation [9]-[11]. However, in single-processor systems parallelisation does not bring a significant acceleration. In this case, a sequential algorithm is more advantageous.
Independently of the applied implementation technique and architecture, the relations discussed in Section III remain equally important.

II. THEORETICAL BACKGROUND
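The DFT is a central part of Fourier analysis and a very important part of DSP, because various DSP applications such as filtering and correlation analysis depend on it. Generally, the DFT of a time-varying signal x(α) with N samples is

$$X(\kappa) = \sum_{\alpha=0}^{N-1} x(\alpha)\, T_N^{\alpha\kappa}, \qquad \kappa = 0, \ldots, N-1, \tag{1}$$

$$T_N^{\alpha\kappa} = e^{-i 2\pi \alpha\kappa / N}, \tag{2}$$

where X(κ) denotes the frequency coefficients, $T_N^{\alpha\kappa}$ is the "twiddle factor", α is the index in the time domain, κ is the index in the frequency domain and i is the imaginary unit. Every fast algorithm for the DFT is based on the "divide and conquer" idea. The radix-2 decimation in time (DIT) FFT decomposes the input signal into its even and odd components, thus the DFT can be written as

$$X(\kappa) = \sum_{\alpha=0}^{N/2-1} x(2\alpha)\, T_N^{2\alpha\kappa} + \sum_{\alpha=0}^{N/2-1} x(2\alpha+1)\, T_N^{(2\alpha+1)\kappa}. \tag{3}$$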
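The even/odd split can be checked numerically. The following is a minimal sketch, assuming the usual forward-transform sign convention $T_N^{\alpha\kappa} = e^{-i 2\pi \alpha\kappa / N}$; the function names `twiddle`, `dft` and `dft_even_odd` are hypothetical and chosen only for this illustration.

```python
import cmath


def twiddle(n, exponent):
    """T_N^exponent, assuming the convention exp(-i*2*pi*exponent/N)."""
    return cmath.exp(-2j * cmath.pi * exponent / n)


def dft(x):
    """Direct O(N^2) evaluation of the DFT sum."""
    n = len(x)
    return [sum(x[a] * twiddle(n, a * k) for a in range(n)) for k in range(n)]


def dft_even_odd(x):
    """Same sum, accumulated separately over even and odd samples.

    Assumes len(x) is even.
    """
    n = len(x)
    half = n // 2
    out = []
    for k in range(n):
        even = sum(x[2 * a] * twiddle(n, 2 * a * k) for a in range(half))
        odd = sum(x[2 * a + 1] * twiddle(n, (2 * a + 1) * k) for a in range(half))
        out.append(even + odd)
    return out
```

Both functions should agree up to rounding error for any even-length input; the split alone saves nothing until the two half-length sums are themselves computed recursively.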
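In the above equation $T_N^{\kappa}$ is the κ-th root of unity (the "twiddle factor" where α = 1). Equation (3) can be written in the simpler form (4), and the symmetry between the roots gives (5) and (6):

$$X(\kappa) = \sum_{\alpha=0}^{N/2-1} x(2\alpha)\, T_{N/2}^{\alpha\kappa} + T_N^{\kappa} \sum_{\alpha=0}^{N/2-1} x(2\alpha+1)\, T_{N/2}^{\alpha\kappa}, \tag{4}$$

$$X(\kappa) = \sum_{\alpha=0}^{N/2-1} x(2\alpha)\, T_{N/2}^{\alpha\kappa} - T_N^{\kappa - N/2} \sum_{\alpha=0}^{N/2-1} x(2\alpha+1)\, T_{N/2}^{\alpha\kappa}, \tag{5}$$

$$T_N^{\kappa + N/2} = -T_N^{\kappa}. \tag{6}$$

In (4) κ = 0, …, N/2-1, in (5) κ = N/2, …, N-1 and α = 0, …, N/2-1 in both formulas. When the number of samples is a power of four ($N = 4^n$), the radix-4 FFT algorithm (7) is more efficient than radix-2. The radix-4 and split-radix (a mixture of radix-2 and radix-4) FFT algorithms take advantage of another symmetry of the roots (8) to reduce the number of multiplications, since a multiplication by ±i only swaps the real and imaginary parts and changes a sign [12]:

$$T_N^{\kappa + N/4} = -i\, T_N^{\kappa}. \tag{8}$$

The radix-4 algorithm splits the input sequence into four subparts, thereby reducing the number of stages from $\log_2 N$ to $\log_4 N$ [13], [14]:

$$X(\kappa) = \sum_{\rho=0}^{3} T_N^{\rho\kappa} \sum_{\alpha=0}^{N/4-1} x(4\alpha+\rho)\, T_{N/4}^{\alpha\kappa}. \tag{7}$$

In the above equation ρ denotes the subpart index.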
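The radix-2 DIT decomposition and the half-period symmetry of the roots lead to the familiar recursive butterfly. The sketch below is a textbook recursive version, not the implementation tested in this paper; it assumes the convention $T_N^{\kappa} = e^{-i 2\pi \kappa / N}$, and the name `fft_radix2` is hypothetical.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (illustrative sketch)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])  # DFT of the even-indexed samples
    odd = fft_radix2(x[1::2])   # DFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t            # first half of the spectrum
        out[k + n // 2] = even[k] - t   # second half: the root only changes sign
    return out
```

Each twiddle product `t` is computed once and reused for two outputs, which is where the half-period symmetry saves work.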

III. THE PROPOSED RELATIONS
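Equation (8) can be written in the following form, where Re and Im refer to the real and imaginary parts:

$$\mathrm{Re}\left(T_N^{\kappa + N/4}\right) = \mathrm{Im}\left(T_N^{\kappa}\right), \tag{9}$$

$$\mathrm{Im}\left(T_N^{\kappa + N/4}\right) = -\mathrm{Re}\left(T_N^{\kappa}\right). \tag{10}$$

The above new forms are more efficient than (8), because they do not require a complex multiplication.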
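In code, the multiplication-free form amounts to a swap and one sign change. The following is a minimal sketch, assuming the convention $T_N^{\kappa} = e^{-i 2\pi \kappa / N}$ (so that a quarter-period shift is a multiplication by -i); the names `root` and `quarter_shift` are hypothetical.

```python
import math


def root(n, k):
    """(Re, Im) of T_N^k, assuming T_N^k = exp(-i*2*pi*k/N)."""
    angle = 2.0 * math.pi * k / n
    return math.cos(angle), -math.sin(angle)


def quarter_shift(re, im):
    """(Re, Im) of T_N^(k+N/4) obtained from (Re, Im) of T_N^k.

    No multiplication is needed: the parts are swapped and one sign flips.
    """
    return im, -re
```

For example, `quarter_shift(*root(16, 1))` should match `root(16, 5)` up to rounding error.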
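Beyond (6) and (8), another important relation exists between the roots. According to the previous statements, we should focus on the first N/4 roots. For instance, if we assume N = 16 and use the Euler form (11), the first N/4 roots are (12)-(15). The Euler form separates the real and imaginary parts of a complex number, which is very useful in programming because the programmer can store the two parts in two different vectors in the program code:

$$T_N^{\kappa} = \cos\left(\frac{2\pi\kappa}{N}\right) - i \sin\left(\frac{2\pi\kappa}{N}\right), \tag{11}$$

$$T_{16}^{0} = \cos(0) - i \sin(0) = 1, \tag{12}$$

$$T_{16}^{1} = \cos\left(\frac{\pi}{8}\right) - i \sin\left(\frac{\pi}{8}\right), \tag{13}$$

$$T_{16}^{2} = \cos\left(\frac{\pi}{4}\right) - i \sin\left(\frac{\pi}{4}\right), \tag{14}$$

$$T_{16}^{3} = \cos\left(\frac{3\pi}{8}\right) - i \sin\left(\frac{3\pi}{8}\right). \tag{15}$$

In the above equations there is a further relation between (13) and (15):

$$\cos\left(\frac{\pi}{8}\right) = \sin\left(\frac{3\pi}{8}\right), \tag{16}$$

$$\sin\left(\frac{\pi}{8}\right) = \cos\left(\frac{3\pi}{8}\right). \tag{17}$$

This comes from a property of the sine and cosine functions, because there is a π/2 shift between them [15]:

$$\sin(x) = \cos\left(\frac{\pi}{2} - x\right), \tag{18}$$

$$\cos(x) = \sin\left(\frac{\pi}{2} - x\right). \tag{19}$$

The above relations can be observed between the roots when N is a power of two. Generally, it can be written as:

$$\cos\left(\frac{2\pi k}{N}\right) = \sin\left(\frac{2\pi (N/4 - k)}{N}\right), \tag{20}$$

$$\sin\left(\frac{2\pi k}{N}\right) = \cos\left(\frac{2\pi (N/4 - k)}{N}\right). \tag{21}$$

Equation (22) is the proof of (20) and (23) is the proof of (21):

$$\sin\left(\frac{2\pi (N/4 - k)}{N}\right) = \sin\left(\frac{\pi}{2} - \frac{2\pi k}{N}\right) = \cos\left(\frac{2\pi k}{N}\right), \tag{22}$$

$$\cos\left(\frac{2\pi (N/4 - k)}{N}\right) = \cos\left(\frac{\pi}{2} - \frac{2\pi k}{N}\right) = \sin\left(\frac{2\pi k}{N}\right). \tag{23}$$

Finally, the newly generated properties (20) and (21) can be written in the more efficient and simpler forms (24) and (25), similarly to (9) and (10), if we suppose that N ≥ 16 and 0 < k < N/4:

$$\mathrm{Re}\left(T_N^{N/4 - k}\right) = -\mathrm{Im}\left(T_N^{k}\right), \tag{24}$$

$$\mathrm{Im}\left(T_N^{N/4 - k}\right) = -\mathrm{Re}\left(T_N^{k}\right). \tag{25}$$

Figure 1 shows the location of the roots in the complex plane and the effect of relations (6), (9), (10), (24) and (25) for N = 16. This clearly illustrates that it is enough to compute the first (N/8 + 1) roots for the FFT algorithm.
The new relation further reduces the number of complex roots that must be computed, and thus saves memory and/or computational capacity. Consequently, it is enough to calculate the first (N/8 + 1) roots, because the other roots can be derived from them.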
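The derivation of all roots from the first (N/8 + 1) can be sketched as follows. This is an illustration under stated assumptions rather than the tested implementation: it assumes the convention $T_N^{k} = e^{-i 2\pi k / N}$, under which the mirror relation reads Re(T^(N/4-k)) = -Im(T^k) and Im(T^(N/4-k)) = -Re(T^k), the quarter shift is a multiplication by -i, and the half shift is a sign change. The name `twiddle_table` is hypothetical. Real and imaginary parts are kept in two separate vectors, as suggested by the Euler form.

```python
import math


def twiddle_table(n):
    """Return (re, im) lists with T_N^k = exp(-i*2*pi*k/N) for k = 0..N-1.

    Only the first N/8 + 1 roots are evaluated with cos/sin; the rest are
    copied with sign changes and swaps.
    """
    if n < 16 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 16")
    re = [0.0] * n
    im = [0.0] * n
    eighth, quarter, half = n // 8, n // 4, n // 2

    # Step 1: evaluate the first N/8 + 1 roots directly (Euler form).
    for k in range(eighth + 1):
        angle = 2.0 * math.pi * k / n
        re[k] = math.cos(angle)
        im[k] = -math.sin(angle)

    # Step 2: mirror inside the first quarter, T^(N/4 - k) from T^k.
    for k in range(1, eighth):
        re[quarter - k] = -im[k]
        im[quarter - k] = -re[k]

    # Step 3: quarter shift, T^(k + N/4) = -i * T^k (swap and one sign flip).
    for k in range(quarter):
        re[k + quarter] = im[k]
        im[k + quarter] = -re[k]

    # Step 4: half shift, T^(k + N/2) = -T^k.
    for k in range(half):
        re[k + half] = -re[k]
        im[k + half] = -im[k]

    return re, im
```

For N = 16 only the roots with k = 0, 1, 2 are computed with trigonometric calls; the remaining 13 entries are filled by copying. A memory-constrained implementation could instead store only the first N/8 + 1 roots and apply the same index arithmetic at lookup time.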

IV. EXPERIMENTAL RESULTS
In order to measure the effect of the proposed relations, we created a modified radix-2 FFT algorithm which utilises (6), (9), (10), (24) and (25). The new relations modify the original "butterfly" structure. Figure 2 illustrates the new structure, where the proposed relations are applied in a radix-2 DIF FFT algorithm (N = 16).
The modified algorithm was compared with two general radix-2 implementations, which can be found in [16], [17]. In most cases the implementation depends on the application type. The transformation should be implemented according to the architecture, the number of samples (N) and the applied programming language or abstraction level. This test mainly focuses on the case when the algorithms are sequential and the sample size is a power of two. As mentioned previously, the efficiency of the algorithm depends on the architecture. Therefore, four different architectures were used in the test to make the survey more reliable:
• Microchip PIC32 32-bit MIPS processor
• Raspberry Pi (RPi) minicomputer
• BeagleBone Linux computer
• Simple PC

All of the devices have different properties. The Chipkit Max32 board contains a PIC32 microcontroller, which runs at an 80 MHz clock and includes 128 KB of RAM. The central processing unit of the RPi is an ARM11 running at 700 MHz. The BeagleBone contains an AM335x 720 MHz ARM Cortex-A8 processor. Finally, the PC has a 2.2 GHz Intel Pentium processor. Obviously, the speed of an algorithm depends on many other parameters, but these are neglected here. During the test, each of the three algorithms (modified FFT, FFT A and FFT B) receives random signals of different sizes and calculates the transformation 100 times for each. Finally, the program returns the average runtime.
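The measurement procedure described above can be sketched as a small timing harness. The tested implementations themselves are written in ANSI C; Python is used here only for brevity, and the function name `average_runtime`, the fixed seed and the uniform sample range are assumptions made for this illustration.

```python
import random
import time


def average_runtime(transform, size, repeats=100, seed=0):
    """Average wall-clock time in seconds of transform() on a random signal.

    transform: any callable that takes a list of samples.
    size: number of samples in the random test signal.
    repeats: number of runs to average over (100 in the described test).
    """
    rng = random.Random(seed)
    signal = [rng.uniform(-1.0, 1.0) for _ in range(size)]
    start = time.perf_counter()
    for _ in range(repeats):
        transform(signal)
    return (time.perf_counter() - start) / repeats
```

Calling `average_runtime(my_fft, 1024)` for each candidate `my_fft` (a hypothetical placeholder) and each sample size would produce one averaged runtime per table cell.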
All three FFT algorithms were implemented in the ANSI C programming language, because a C compiler is available for all four devices. Table I shows the test results, where the times are given in seconds.

V. CONCLUSIONS
In this paper we presented a new relation between "twiddle factors" in (20) and (21) and its proof in (22) and (23), which has not been mentioned in the literature yet. This relation reduces the number of roots necessary for the fast Fourier transformation. Moreover, we proposed more efficient forms of an existing relation and of the new relation in (9), (10), (24) and (25). To support the proposed relations, we carried out a test in which a modified radix-2 FFT was compared to two other algorithms. The experimental results indicate that the modified FFT is more efficient than the two general implementations it was compared with. Since fewer roots are enough for the algorithm, the root calculation time is greatly reduced. In addition, the test results show the architecture dependence of the algorithms. FFT A is faster than FFT B on the BeagleBone, while on the other devices FFT A is slower. Furthermore, the modified FFT is more efficient on the PIC32 and BeagleBone than on the RPi.
Today, the Internet of Things (IoT) is a dynamically expanding research area. The IoT is a network of different things or objects which can communicate, sense and interact with each other via the Internet [18]-[21]. In many cases, the objects are small devices which belong to two main categories: controllers and sensors. A controller can be a mini-computer, a microcontroller or an FPGA [22]-[23]. Generally, the controllers perform all of the data processing. However, most controllers have limited computational and storage capacity. This can cause problems when the controller runs DSP algorithms or applications. It means that an optimised FFT algorithm which utilises every relation (in optimised form) between the complex roots can be very useful on such devices. Moreover, the power consumption is expected to improve similarly.

Fig. 1. The effect of the relations. In the figure, roots shown in the same colour are related to each other.

Fig. 2. The "butterfly" structure of the modified algorithm. In the figure, +/- indicates the correct sign according to the proposed relations.

TABLE I. TEST RESULTS.