Log-polar Transformation as a Tool for Text Skew Estimation

The paper proposes a method for text skew detection based on the log-polar transformation. The original image and a control ellipse are both transformed into the log-polar domain. Their cross-correlation establishes the cost function. Extracting the maxima of the cost function gives the text skew in the regions to the left and right of the centre point of the transformation. The method is suitable for printed text. It is characterized by good accuracy and low computational cost.


I. INTRODUCTION
Printed text is characterized by a pronounced regularity in shape [1]. This means that the letters have similar sizes. The distance between text lines is generally adequate, so the inter-line spacing is sufficient to separate the text lines. The orientation of the text lines is similar, which forms a uniform text skew. All of the listed attributes are relatively predictable characteristics, and they simplify skew estimation for printed text. It should be noted that the occurrence of text skew is unavoidable, since it is a consequence of the digitization process. However, its presence can cause an optical character recognition (OCR) system to fail. Hence, text skew estimation is a crucial step in OCR [2].
This paper introduces a new algorithm for text skew estimation. It is based on the combination of two methods, log-polar transformation and cross-correlation. First, it converts two images, i.e. the original image and an image with a reference text skew, into log-polar space. Then, both images are cross-correlated in the log-polar domain in order to measure their similarity. As a result, the cross-correlation function is obtained. This cost function represents the similarity measure between the images [3]. Hence, its maximum values give the optimum matching. The function's horizontal axis represents the angle of the image rotation. In this way, its maxima identify the text skew of the image. The proposed algorithm shows good skew estimation at the standard image resolution of 300 dpi. It is also computationally inexpensive, which is a consequence of the binary image processing. Hence, the method is a promising one.
Manuscript received March 12, 2012; accepted May 14, 2012.
The paper is organized as follows. Section 2 briefly describes the principle of the log-polar transformation. Section 3 proposes the algorithm for text skew estimation based on the log-polar transformation. Section 4 explains the steps of the algorithm. Section 5 defines the experiments. Section 6 discusses the obtained results. Section 7 gives the conclusions.

II. LOG-POLAR TRANSFORMATION
The log-polar transformation maps points from the image, i.e. Cartesian space, to points in the log-polar parametric space. It reduces the data vector depending on the position in the image. In this respect, it is modelled on the retina of the primate eye [3]. First, the Cartesian coordinates x and y are converted into polar coordinates, radius r and angle θ. Their mapping is as follows [3]:
$$r = \sqrt{x^2 + y^2}, \tag{1}$$
$$\theta = \arctan\left(\frac{y}{x}\right). \tag{2}$$
Furthermore, the log transformation is obtained as
$$\rho = \log(r). \tag{3}$$
Hence, the (x, y) coordinates from Cartesian space are mapped to (ρ, θ) coordinates in log-polar space.
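The mapping of a single point from (x, y) to (ρ, θ) can be illustrated with a minimal sketch. The function name `to_log_polar` is hypothetical. Two choices here are assumptions and are not stated in the paper: the natural logarithm is used for ρ, and `atan2` replaces the plain arctangent so that the angle is resolved in all four quadrants.

```python
import math

def to_log_polar(x, y):
    """Map a Cartesian point (x, y) to log-polar coordinates (rho, theta)."""
    r = math.hypot(x, y)        # r = sqrt(x^2 + y^2)
    if r == 0.0:
        raise ValueError("rho = log(r) is undefined at the origin")
    theta = math.atan2(y, x)    # quadrant-aware variant of arctan(y / x) (assumption)
    rho = math.log(r)           # natural logarithm (assumption)
    return rho, theta

rho, theta = to_log_polar(3.0, 4.0)
print(rho, math.degrees(theta))
```

The origin is rejected explicitly because the logarithm of a zero radius is undefined. A practical implementation usually starts sampling at a small positive radius instead.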

III. ALGORITHM
The document text image is the product of the image scanning process. It is a digital grey-level image given by the matrix D. It consists of M rows, N columns, and L intensity levels of grey. Each intensity level is an integer from {0,…,255}. Hence, D(i, j) ∈ {0,…,255}, where i = 1,…,M and j = 1,…,N. After the binarization process, the image is transformed into a binary image B, whose elements are equal to 0 or 1. If D(i, j) ≥ D_th(i, j) then B(i, j) = 1, otherwise B(i, j) = 0. Here, D_th(i, j) represents the global threshold sensitivity decision value [4], [5].
At this point, the document text image is given as the binary matrix B with M rows, N columns, and two intensity levels: 0 and 1.
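The thresholding rule above can be sketched in a few lines. The function name `binarize` and the threshold value 128 are illustrative assumptions. The paper does not state how the global threshold is chosen.

```python
import numpy as np

def binarize(D, d_th):
    """Return B with B(i, j) = 1 where D(i, j) >= d_th and 0 elsewhere."""
    D = np.asarray(D)
    # d_th may be a scalar (global threshold) or an array with the shape of D.
    return (D >= d_th).astype(np.uint8)

# Small made-up grey-level image; 128 is an assumed threshold value.
D = np.array([[12, 200, 90],
              [128, 127, 255]], dtype=np.uint8)
B = binarize(D, 128)
print(B)
```

Because the comparison is vectorized, the same function also accepts a per-pixel threshold matrix, which matches the D_th(i, j) notation.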
The log-polar transformation is a nonlinear and non-uniform sampling of the spatial domain. The nonlinearity is introduced by the polar mapping, while the non-uniform sampling is the result of the logarithmic scaling [6]. Consider the log-polar coordinate system, which denotes the radial distance and the angle from the centre. For the input binary image B, the centre point of the transformation is B(m_c, n_c). The radius is denoted R. It ensures that the maximum number of pixels is included within the reference circle of the conversion. The centre of the circle is given as m_c = M/2 and n_c = N/2 [7]. Next, the image is converted into the polar coordinate system. In this way, the binary image B is transformed into the polar domain (r, θ), where [6]:
$$r = \sqrt{(i - m_c)^2 + (j - n_c)^2}, \tag{4}$$
$$\theta = \arctan\left(\frac{j - n_c}{i - m_c}\right), \tag{5}$$
and i = 1, …, M, j = 1, …, N. Furthermore, the log-polar transform is given as (ρ, θ), where ρ is obtained from (3). In the log-polar domain, the matrices of the text image B and the reference object E are denoted BC and EC. Cross-correlation is a similarity measure between two images. Its function is
$$CC(\theta) = C_{coeff}(BC, ECS), \tag{6}$$
where ECS is circshift(EC, θ) and C_coeff(BC, ECS) is given as
$$C_{coeff}(BC, ECS) = \frac{\sum_{k}\sum_{l}\left(BC_{kl} - \overline{BC}\right)\left(ECS_{kl} - \overline{ECS}\right)}{\sqrt{\left(\sum_{k}\sum_{l}\left(BC_{kl} - \overline{BC}\right)^2\right)\left(\sum_{k}\sum_{l}\left(ECS_{kl} - \overline{ECS}\right)^2\right)}}, \tag{7}$$
where the overbar denotes the mean of the matrix elements. The more alike the images are, the closer the cross-correlation function CC(θ) approaches 1.
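The transformation and the cost function described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The grid sizes `n_rho` and `n_theta`, the nearest-neighbour sampling, the choice R = min(M, N)/2, and the function names are all hypothetical. `np.roll` along the θ axis plays the role of circshift.

```python
import numpy as np

def log_polar(B, n_rho=64, n_theta=360):
    """Nearest-neighbour resampling of image B onto a (rho, theta) grid."""
    B = np.asarray(B)
    M, N = B.shape
    mc, nc = M / 2.0, N / 2.0                   # centre (m_c, n_c) = (M/2, N/2)
    R = min(mc, nc)                             # assumed radius of the reference circle
    rho = np.linspace(0.0, np.log(R), n_rho)    # rho = log(r) for r in [1, R]
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    r = np.exp(rho)[:, None]
    i = np.rint(mc + r * np.cos(theta)[None, :]).astype(int)   # row index
    j = np.rint(nc + r * np.sin(theta)[None, :]).astype(int)   # column index
    inside = (i >= 0) & (i < M) & (j >= 0) & (j < N)
    out = np.zeros((n_rho, n_theta), dtype=float)
    out[inside] = B[i[inside], j[inside]]       # samples outside the image stay 0
    return out

def corr_coeff(A, E):
    """2-D correlation coefficient of two equally sized arrays."""
    a = A - A.mean()
    e = E - E.mean()
    denom = np.sqrt((a * a).sum() * (e * e).sum())
    return float((a * e).sum() / denom) if denom > 0 else 0.0

def cost_function(BC, EC):
    """CC over all circular shifts of EC along the theta axis."""
    n_theta = BC.shape[1]
    return np.array([corr_coeff(BC, np.roll(EC, k, axis=1))
                     for k in range(n_theta)])

# Toy check with a made-up image: a horizontal bar correlated with itself.
B = np.zeros((64, 64), dtype=np.uint8)
B[30:34, 8:56] = 1
BC = log_polar(B)
CC = cost_function(BC, BC)
print(int(np.argmax(CC)), float(CC.max()))
```

Evaluating every shift explicitly keeps the sketch close to the definition. For large grids, the same result is usually obtained faster with an FFT-based circular correlation.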
Identifying a rotation in the spatial domain, i.e. image space, is a complex task. However, a rotation in image space is mapped into a translation in log-polar space. A translation along one axis is easy to detect. Suppose that a reference object is rotated in the spatial domain. If it is cross-correlated with the text image for different angles, the rotation is read out as a translation in log-polar space. The first objective is the selection of a suitable reference object. An ellipse is a good choice because it can overlap the text efficiently. However, the ellipse has to be normalized according to the dimensions of the text image. Furthermore, it is split into a left and a right part at the centre point of the transformation. The two parts of the ellipse are then matched with the original image by cross-correlation, which establishes the left and right skew estimates. Unlike other methods, the proposed algorithm identifies two skews, a left and a right one, which is its advantage.
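A reference ellipse normalized to the text image size, and its split into left and right parts, might be generated as in the sketch below. The semi-axis fractions `a_frac` and `b_frac` are hypothetical parameters chosen for illustration. The paper does not give the normalization constants.

```python
import numpy as np

def ellipse_image(M, N, a_frac=0.45, b_frac=0.10):
    """Binary M x N image with a filled, axis-aligned ellipse at the centre."""
    mc, nc = M / 2.0, N / 2.0
    a = a_frac * N            # semi-axis along the columns (assumed fraction)
    b = b_frac * M            # semi-axis along the rows (assumed fraction)
    i, j = np.mgrid[0:M, 0:N]
    inside = ((j - nc) / a) ** 2 + ((i - mc) / b) ** 2 <= 1.0
    return inside.astype(np.uint8)

def split_left_right(E):
    """Split an image at the centre column into left and right parts."""
    nc = E.shape[1] // 2
    left = E.copy()
    left[:, nc:] = 0          # keep only the columns left of the centre
    right = E.copy()
    right[:, :nc] = 0         # keep only the columns from the centre onwards
    return left, right

E = ellipse_image(100, 200)
E_left, E_right = split_left_right(E)
print(E.sum(), E_left.sum(), E_right.sum())
```

Both halves keep the full image size, so each can be passed through the same log-polar transformation as the text image and correlated separately.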

IV. ALGORITHM PROCEDURE
The algorithm for text skew estimation based on the log-polar transformation consists of the following steps:
1) Text image extraction by a bounding box (text image).
2) Identification of the centre point for log-polar transformation.
3) Creation of the binary image with the normalized ellipse (ellipse image).
Step 2. The transformation centre point B(m, n) is extracted according to the pixel density at the centre of the bounding box.
Step 3. The ellipse is created as a binary image. Its size depends on the original text image, i.e. it is normalized. The ellipse is shown in Fig. 2. From (4)-(6), the log-polar transformation of the text and ellipse images is obtained.
Step 4. Fig. 3 shows the log-polar transformation of the text image.
Step 5. Fig. 4 shows the log-polar transformation of the ellipse.
Step 6. The cross-correlation of the text and ellipse images in the log-polar domain results in the cross-correlation cost function. It is shown in Fig. 5.
Step 7. From Fig. 5, the cost function has two maxima (around 0° and 180°). These maxima represent the two angles of the text skew seen from the transformation centre point (left and right). The information is then returned from the log-polar domain to the spatial image domain. An excerpt of the cross-correlation cost function for the left and right parts from the transformation point is shown in Fig. 6.
Step 8. As a result, the left and right skew angle lines extracted from the cost function are drawn. Fig. 7 shows these lines.
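Reading the two skew angles from the cost function in Step 7 can be sketched as a windowed peak search around 0° and 180°. The function name, the window half-width of 60°, and the synthetic cost function are assumptions made for illustration. The sketch also does not decide which peak corresponds to the left or the right part, since the text does not specify this.

```python
import numpy as np

def skew_from_cost(cc, window_deg=60.0):
    """Locate the cost function peaks near 0 and 180 degrees.

    cc holds CC sampled uniformly over 360 degrees. The result gives each
    peak as a signed offset in degrees from 0 and from 180 respectively.
    """
    cc = np.asarray(cc, dtype=float)
    n = cc.size
    step = 360.0 / n                          # degrees per sample
    half = int(round(window_deg / step))      # window half-width in samples
    offsets = np.arange(-half, half + 1)
    peaks = {}
    for name, centre_deg in (("near_0", 0.0), ("near_180", 180.0)):
        centre = int(round(centre_deg / step))
        idx = (centre + offsets) % n          # wrap around the circular axis
        best = offsets[np.argmax(cc[idx])]
        peaks[name] = float(best * step)
    return peaks

# Synthetic cost function with made-up peaks at 7 and 173 degrees.
cc = np.zeros(360)
cc[7] = 1.0
cc[173] = 0.9
print(skew_from_cost(cc))
```

The modulo indexing matters for the window around 0°, because negative skew angles appear at the far end of the θ axis.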

V. EXPERIMENTS
The main task of the experiments is to evaluate the algorithm's ability to estimate the text skew correctly. The experiments were performed mostly on synthetic data sets. They consist of single-line samples of printed text [9]. The text is rotated by the angle θ, from 0° to 60° in 5° steps, with respect to the x-axis. It is shown in Fig. 8.
where RE(θ) represents a relative error.
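The defining equation for the error measure is not reproduced in this text. The sketch below therefore assumes the usual definitions: the absolute deviation is |θ_ref − θ_est|, and the relative error is that deviation divided by |θ_ref|. Both function names and the example estimate are hypothetical and are not results from the paper.

```python
def absolute_deviation(theta_ref, theta_est):
    """Absolute deviation between reference and estimated skew, in degrees."""
    return abs(theta_ref - theta_est)

def relative_error(theta_ref, theta_est):
    """Relative error, assumed to be the deviation divided by |theta_ref|."""
    if theta_ref == 0:
        raise ValueError("relative error is undefined for a 0 degree reference")
    return absolute_deviation(theta_ref, theta_est) / abs(theta_ref)

# Reference angles of the experiment: 0 to 60 degrees in 5 degree steps.
reference_angles = list(range(0, 65, 5))

# Made-up estimate for a 10 degree sample, for illustration only.
print(absolute_deviation(10.0, 10.5), relative_error(10.0, 10.5))
```

Under this assumed definition the 0° sample has no relative error, so only the absolute deviation is meaningful there.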

VI. RESULTS AND DISCUSSION
The results of testing are given in Table I in the Appendix. The algorithm shows good results over the whole range of tested angles. Hence, the proposed method is promising in terms of skew estimation accuracy. Furthermore, it is computationally inexpensive.

VII. CONCLUSIONS
The paper introduced a method for text skew estimation based on the log-polar transformation and cross-correlation. It estimates the similarity between the text image and an ellipse in the log-polar domain. As a result, the cross-correlation cost function is obtained. Its maxima give the text skew to the left and right of the transformation centre point. The proposed method shows good results for skew estimation of printed and hand-printed text. Future investigation will address skew estimation for handwritten text.
APPENDIX A

Fig. 6. Cross-correlation cost function for the left ellipse (a) and the right ellipse (b) with the text image.

Fig. 8. A printed text sample rotated up to 60°.
Furthermore, all text samples are given at the standard resolution of 300 dpi. The results are evaluated by the absolute deviation, i.e. error. It is given as:

Fig. 10 shows RLHR. It has a very good value over the whole range of tested angles.

TABLE I. TESTING RESULTS.