Polynomial Curve Fitting with Varying Real Powers


Introduction
Curve fitting, or function approximation, is the process of fitting curves to a series of data points [1]. In this way, the data are represented by a mathematical function. The fitted curves can be obtained by regression, splines, or interpolation methods.
In general, polynomials are preferred as the fitting functions, and their coefficients can be found by solving a linear system. Let $X \in \mathbb{R}^{M \times N}$ be the input data and $y \in \mathbb{R}^{M \times 1}$ be the corresponding outputs. The relation between inputs and outputs is given by the linear equations [2]

$$
\begin{aligned}
x_{11}\theta_1 + x_{12}\theta_2 + \dots + x_{1N}\theta_N &= y_1, \\
x_{21}\theta_1 + x_{22}\theta_2 + \dots + x_{2N}\theta_N &= y_2, \\
&\;\;\vdots \\
x_{M1}\theta_1 + x_{M2}\theta_2 + \dots + x_{MN}\theta_N &= y_M.
\end{aligned} \tag{1}
$$

Using matrix notation, Eq. (1) is briefly rewritten as

$$X\theta = y, \tag{2}$$

where $\theta$ denotes the unknown parameter vector. The parameter vector can be obtained by linear algebra if the input matrix $X$ is square ($M = N$) and non-singular. The linear solution for the unknown parameter vector $\theta$ is then

$$\theta = X^{-1}y. \tag{3}$$

When $M > N$, the input matrix $X$ is not square and cannot be inverted, but $X^{T}X$ is non-singular (provided the columns of $X$ are linearly independent), and the solution is obtained by

$$\hat{\theta} = (X^{T}X)^{-1}X^{T}y, \tag{4}$$

where $\hat{\theta}$ denotes the approximated unknown parameters. The new linear system includes an error. Therefore, the linear equation in (2) is extended to

$$X\hat{\theta} + e = y, \tag{5}$$

where $e$ and $X\hat{\theta}$ denote the error vector and the estimated output, respectively. This system can be solved by the least squares estimation (LSE) method [1][2]. The LSE method minimizes the estimation error by using the sum of squared errors as the cost function

$$E(\hat{\theta}) = \sum_{k=1}^{M} \left(y_k - x_k\hat{\theta}\right)^2 = e^{T}e, \tag{6}$$

where $x_k$ denotes the $k$th input row vector. In linear networks, the unknown parameters can also be estimated by derivative-based methods. The gradient of the error function with respect to the parameters is obtained by the chain rule, and the optimum parameter values are then determined by iterative algorithms.
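The least squares solution in Eq. (4) and the error vector in Eq. (5) can be illustrated with a short NumPy sketch. The data, the dimensions, and the names `lse_normal_equations` and `theta_true` are assumptions made for this illustration only; the comparison against `np.linalg.lstsq` is included as a numerical cross-check.

```python
import numpy as np

def lse_normal_equations(X, y):
    """Solve theta_hat = (X^T X)^{-1} X^T y without forming the inverse explicitly."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Synthetic overdetermined system (M > N); all values are illustrative.
rng = np.random.default_rng(0)
M, N = 50, 3
X = rng.normal(size=(M, N))
theta_true = np.array([1.5, -2.0, 0.5])
y = X @ theta_true + 0.01 * rng.normal(size=M)

theta_hat = lse_normal_equations(X, y)

# Error vector e = y - X theta_hat and the sum-of-squared-errors cost.
e = y - X @ theta_hat
sse = float(e @ e)

# Cross-check against NumPy's least squares solver.
theta_ref, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta_hat, theta_ref, sse)
```

Solving the normal equations with `np.linalg.solve` avoids an explicit matrix inverse; for ill-conditioned inputs, `np.linalg.lstsq` is usually the safer choice.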
Unlike linear problems, nonlinear problems require high-order polynomials. Let $x$ denote the one-dimensional input vector and $y = f(x)$ denote the output, where $f(x)$ is a high-order polynomial. The relation is defined as

$$y = f(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \dots + \theta_N x^N. \tag{7}$$

Similar to linear systems, the nonlinear system can be solved by Eq. (4). In matrix notation, this nonlinear system can also be defined as

$$
\begin{bmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^N \\
1 & x_2 & x_2^2 & \cdots & x_2^N \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_M & x_M^2 & \cdots & x_M^N
\end{bmatrix}
\begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_N \end{bmatrix}
=
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix}. \tag{8}
$$

The LSE method is commonly used in real problems such as classification, curve fitting, control, system identification, estimation, and smoothing [2][3][4][5][6].

http://dx.doi.org/10.5755/j01.eee.112.6.460

Classical polynomial curve fitting uses nonnegative integer exponents, which restricts the working space, and the errors cannot always be decreased sufficiently in this restricted space. If real exponents are used in the functions rather than positive integer exponents, the working space is extended from the real space to the complex space. This new complex polynomial with real exponents is defined as

$$y = \theta_0 + \sum_{k=1}^{N} \theta_k x^{p_k}, \tag{9}$$

where $\theta_k$ and $y$ can be complex vectors and $p_k$ denotes a real power. However, this extension brings new problems. The first is how to determine the polynomial degrees. The second is how to determine the complex coefficients. The last is how to determine the outputs and relate them to real-world problems. When the inputs are negative real numbers and the exponents are real numbers, the outputs of the polynomial function can be complex numbers. For example, $(-2)^{2.5} = 0.00 + 5.66i$, $(-2)^{3} = -8$, $(-2)^{3.5} = -0.00 - 11.31i$, and $0^{p} = \infty$ for $p < 0$. These concerns are important for curve fitting. Therefore, the input space should either lie in the positive region or be mapped to the positive region.

Royston and Altman used fractional polynomials to fit medical data [7]. In their applications, however, a set of pre-determined real powers was held constant across the candidate curves, and the best curve was selected among them according to the regression errors. In addition, the powers were restricted to the range from -2 to 2.
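The complex values quoted in the example with a base of -2 can be reproduced with Python's built-in complex arithmetic. The helper name `real_power` is hypothetical and exists only for this sketch; it returns the principal value of the power.

```python
def real_power(x: float, p: float) -> complex:
    """Principal value of x**p for a real base x and a real exponent p."""
    # Casting the base to complex lets negative bases take non-integer powers.
    return complex(x) ** p

for p in (2.5, 3.0, 3.5):
    print(p, real_power(-2.0, p))

# A zero base with a negative exponent is unbounded; Python raises an error.
try:
    real_power(0.0, -0.5)
except ZeroDivisionError as exc:
    print("unbounded:", exc)
```

A positive base keeps the result on the real axis, which is why restricting or mapping the inputs to the positive region avoids complex outputs.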
In this study, a hybrid method that combines a genetic algorithm with LSE is proposed for estimating real positive exponents in curve fitting.

Materials and Methods
A new hybrid curve fitting method, called varying real powers (VRP), is developed in this study. This method uses a genetic algorithm to estimate more accurate real exponents for nonlinear systems. LSE is employed to determine the polynomial coefficients. The developed method is compared with traditional LSE for estimating the coefficients of polynomials.

Genetic Algorithms
Genetic algorithms (GAs) are derivative-free stochastic optimization methods based on natural selection and evolutionary processes [2,8,9]. They have global search abilities for determining the optimum parameters. Therefore, GA methods have been used extensively as optimization methods in fuzzy systems, neural networks, and support vector machines [2,10,11].
In GAs, the system parameters are encoded as binary strings called chromosomes. The chromosomes constitute the population, which is regarded as a collection of solution vectors. For a better evolutionary process, more than one chromosome is required for each individual parameter. These chromosomes are regarded as parents and are used to generate a new generation (new possible solutions) by crossover and mutation operations. The new generation is subjected to a survivability test based on fitness values, and the surviving chromosomes become the parent chromosomes of the next generation. The steps of the genetic algorithm are as follows [2,8,9]:
1. An initial population is randomly created from the input space.
2. The GA then creates a sequence of new populations. At each step, the algorithm uses the chromosomes in the current generation to create the next population:
a) The members of the current population are scored by their fitness values.
b) Members are selected as parents according to their fitness scores.
c) Some members of the current population that have lower fitness values (lower cost in a minimization setting) are chosen as elite. These elite members are moved to the next population.
d) Children are created from the parents by mutation and crossover operations.
e) The current population is replaced by the children, which gives the next generation.
3. The algorithm terminates when the specified number of generations is reached or the best solution is obtained.
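The steps above can be sketched as a minimal binary-coded GA for a single parameter. This is an illustration under stated assumptions, not the implementation used in the study: tournament selection, single-point crossover, bit-flip mutation, a fixed generation budget as the only stopping rule, and all names and default values (`binary_ga`, `n_bits`, `p_mut`, and so on) are choices made for this example.

```python
import random

def decode(bits, lo, hi):
    """Map a list of 0/1 genes to a real value in [lo, hi]."""
    value = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)

def binary_ga(cost, lo, hi, n_bits=16, pop_size=30, generations=60,
              n_elite=2, p_mut=0.02, seed=0):
    rng = random.Random(seed)

    def score(chrom):
        return cost(decode(chrom, lo, hi))  # lower cost is better

    # Step 1: random initial population of binary chromosomes.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):  # Step 3: stop after a fixed number of generations
        ranked = sorted(pop, key=score)          # Step 2a: score the population
        history.append(score(ranked[0]))
        next_pop = [c[:] for c in ranked[:n_elite]]  # Step 2c: keep the elite
        while len(next_pop) < pop_size:
            # Step 2b: pick each parent as the best of three random members.
            p1 = min(rng.sample(ranked, 3), key=score)
            p2 = min(rng.sample(ranked, 3), key=score)
            # Step 2d: single-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n_bits - 1)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            next_pop.append(child)
        pop = next_pop                            # Step 2e: replace the population
    best = min(pop, key=score)
    return decode(best, lo, hi), history

# Toy usage: minimize a one-dimensional quadratic on [-5, 5].
best_x, history = binary_ga(lambda v: (v - 2.0) ** 2, -5.0, 5.0)
print(best_x, history[-1])
```

Because the elite members are copied unchanged, the best cost recorded in `history` cannot increase from one generation to the next.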
The VRP method uses the following algorithm to fit a given function:
1. Set the input range and the number of powers.
2. for g = 1 to generation number
   Determine the powers $p_k$, $k = 1, 2, \dots, N$, using GA.
   Determine the coefficients $\theta_k$ using LSE.
   Evaluate the VRP function with Eq. (9).
   end

The powers $p_k$ can also be adapted by derivative-based methods such as steepest descent or conjugate gradient [2,12]. However, derivative-based methods run into boundary problems when computing the gradient vector of the powers. When $x < 0$, the gradient with respect to the powers lies in $\mathbb{C}$, where $\mathbb{C}$ denotes the complex numbers. Therefore, GA is more suitable than derivative-based methods for fitting these functions.
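The loop above can be sketched in Python as follows. This is a simplified illustration and not the authors' implementation: it swaps the binary-coded GA for a small real-coded evolutionary loop (blend crossover, Gaussian mutation, elitism), assumes a constant term in the model, and uses a narrower power range than the study. The target function and every name and default value (`vrp_fit`, `lse_for_powers`, `sigma`, and so on) are hypothetical.

```python
import numpy as np

def lse_for_powers(x, y, powers):
    """Inner step: with the powers fixed, the model is linear in the coefficients."""
    # Columns: constant term (assumed) followed by x^{p_k}; x must be positive.
    X = np.column_stack([np.ones_like(x)] + [x ** p for p in powers])
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rmse = float(np.sqrt(np.mean((X @ theta - y) ** 2)))
    return theta, rmse

def vrp_fit(x, y, n_powers, p_range=(-5.0, 5.0), pop_size=40,
            generations=80, n_elite=4, sigma=0.3, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = p_range
    pop = rng.uniform(lo, hi, size=(pop_size, n_powers))  # candidate power vectors
    for _ in range(generations):
        # Outer step: rank candidate powers by the RMSE of their LSE fit.
        rmse = np.array([lse_for_powers(x, y, p)[1] for p in pop])
        pop = pop[np.argsort(rmse)]
        parents = pop[: pop_size // 2]
        n_children = pop_size - n_elite
        a = parents[rng.integers(len(parents), size=n_children)]
        b = parents[rng.integers(len(parents), size=n_children)]
        w = rng.uniform(size=(n_children, 1))
        children = w * a + (1.0 - w) * b                       # blend crossover
        children += rng.normal(0.0, sigma, size=children.shape)  # mutation
        pop = np.vstack([pop[:n_elite], np.clip(children, lo, hi)])
    rmse = np.array([lse_for_powers(x, y, p)[1] for p in pop])
    best = pop[np.argmin(rmse)]
    theta, best_rmse = lse_for_powers(x, y, best)
    return best, theta, best_rmse

# Hypothetical target with a single real power on a positive input range.
x = np.linspace(0.5, 3.0, 60)
y = 1.0 + 2.0 * x ** 1.7
powers, theta, rmse = vrp_fit(x, y, n_powers=1)
print(powers, theta, rmse)
```

Keeping the coefficients out of the evolutionary search means only the powers are searched stochastically; each candidate is scored by a single linear least squares solve.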

Experimental Studies
The first example has only one exponent. The output function is

$$y_1 = 2.7\,x^{0.75} + 10. \tag{10}$$

This function can be regarded as a linear function.
The polynomial curve of the given function is fitted by the proposed method and, for comparison, by traditional least squares estimation (TLSE). The adaptive neuro-fuzzy inference system (ANFIS) is also included in the comparison. ANFIS is a well-known function approximator based on fuzzy systems. It divides the input space into sub-regions, each of which is described by a first-order polynomial. The obtained results are plotted in Fig. 1 and given in Table 1. In Table 1, #pow and #par denote the number of powers and the number of adapted parameters of the polynomials, respectively. The root mean square error (RMSE) criterion is used for comparison. ANFIS produces 4 fuzzy rules and 4 polynomials for $y_1$. For that reason, the ANFIS polynomials are not presented in any table. Table 1 shows that the proposed method gives coefficients and an exponent close to the original values. The RMSE value of VRP is also the best among the methods. It can be seen in Fig. 1 that the VRP curve overlaps the original function.
The second function, $y_2$, is defined with two real powers, 6.7 and 3.49. The obtained results are given in Table 2 and plotted in Fig. 2. According to Table 2 and Fig. 2, VRP is the best among the compared methods under the same conditions. When the degree of TLSE is increased to 7, TLSE gives promising results. However, TLSE reaches that result with a much larger number of parameters than VRP.
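The trade-off between polynomial degree and parameter count in TLSE can be illustrated with `np.polyfit`. The target below uses the two powers named above with unit coefficients, which is an assumption made purely for illustration; the input range and the function name `tlse_rmse` are likewise hypothetical, and no figures from the tables are reproduced here.

```python
import numpy as np

def tlse_rmse(x, y, degree):
    """Fit an integer-degree polynomial by least squares; return (RMSE, #parameters)."""
    coeffs = np.polyfit(x, y, degree)
    residual = np.polyval(coeffs, x) - y
    return float(np.sqrt(np.mean(residual ** 2))), degree + 1

# Hypothetical target: two real powers with assumed unit coefficients.
x = np.linspace(0.1, 2.0, 200)
y = x ** 6.7 + x ** 3.49

for degree in (2, 3, 5, 7):
    rmse, n_par = tlse_rmse(x, y, degree)
    print(f"degree={degree}  parameters={n_par}  RMSE={rmse:.3e}")
```

Each extra degree adds one coefficient, so an integer-power fit of degree 7 needs 8 parameters, whereas a two-power model with a constant term needs the two powers and three coefficients.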
The third example, $y_3$, is more complex than the previous examples. The obtained results are given in Table 3 and plotted in Fig. 3. According to the results, VRP is better than TLSE; however, ANFIS is the best.

In addition to one-dimensional functions, a two-dimensional surface fitting problem is also evaluated in this study. The surface, $y_4$, is formed from two bowls, each defined around its own center. The obtained results are given in Table 4 and plotted in Fig. 4. In Table 4, the TLSE and VRP polynomials with 21 terms are not presented because of their size. Surface fitting is more challenging than curve fitting, but VRP is still the best among the compared methods according to the RMSE values. The performance of VRP is also validated on the same problem with 20 powers, and the obtained surfaces are plotted in Fig. 5.

In this study, the powers are constrained to lie between -50 and +50. Extending this range may lead to the global solution, but it significantly increases the computation time when genetic algorithms are used. This problem can be addressed by parallel programming.
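Since each candidate set of powers is scored independently, the fitness evaluation of a population is a natural place to apply parallelism. The sketch below is one possible arrangement using Python's standard `concurrent.futures` module with a thread pool; the function names, the worker count, and the test data are assumptions for illustration. Threads help here only to the extent that NumPy releases the interpreter lock inside its linear algebra routines; a process pool is the usual alternative for heavier pure-Python workloads.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def rmse_for_powers(x, y, powers):
    """Score one candidate: LSE coefficients for fixed powers, then RMSE."""
    X = np.column_stack([np.ones_like(x)] + [x ** p for p in powers])
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sqrt(np.mean((X @ theta - y) ** 2)))

def evaluate_population(x, y, population, max_workers=4):
    """Score all candidates concurrently; results keep the population order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: rmse_for_powers(x, y, p), population))

# Hypothetical data and a random population of two-power candidates.
rng = np.random.default_rng(0)
x = np.linspace(0.5, 2.0, 100)
y = 3.0 * x ** 2.2 - x ** 0.4
population = rng.uniform(-5.0, 5.0, size=(32, 2))

scores = evaluate_population(x, y, population)
print(min(scores))
```

`pool.map` returns results in input order, so the scores can be matched to the candidates by index without extra bookkeeping.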
It is also clear that using correlated powers in the curves would improve the fitting success. In that case, however, the number of tuned parameters would increase rapidly. For that reason, only uncorrelated powers are used in this study.

Conclusion
A hybrid tuning method that combines a genetic algorithm and LSE is proposed to fit target curves by using varying real powers. In this method, the genetic algorithm tunes the powers of the polynomial while LSE tunes its coefficients. One point to note is that the method generates complex numbers when the inputs are negative and the powers are real numbers. Therefore, only positive inputs were used in the experiments. The experimental studies show that using real powers rather than positive integer powers significantly improves the approximation performance in curve fitting problems. The proposed method can be used in a wide range of fields, including pattern recognition, estimation, signal processing, and control, provided that the complex results obtained are projected to real numbers. New polynomial classifiers can be proposed on this basis. New polynomial feature extraction methods or definition methods, such as the Legendre and Zernike polynomials, can also be defined. We hope that the proposed polynomial approximation will shed light on new methods and ideas.

Table 1. Curve fitting results of the methods for $y_1$

Table 2. Curve fitting results of the methods for $y_2$

Table 3. Curve fitting results of the methods for $y_3$

Table 4. Curve fitting results of the methods for $y_4$