Optimization of Image Processing in Video-based Traffic Monitoring

Video-based traffic monitoring systems have been widely used. Such a system usually reads real-time monitoring video and converts it into images for processing. However, these systems are often limited by their image processing algorithms and do not perform as well as expected. We hereby propose optimization approaches for image processing. As obtaining a binary image is usually a fundamental step in image processing, we first present an adaptive thresholds approach for binary conversion. The approach takes the spatial information of the pixel into consideration and chooses thresholds adaptively according to each pixel and its neighboring pixels. Then we introduce a three-dimension Gaussian filter, which has the best quality-time tradeoff, to remove noise in the image. Although widely used, background subtraction is limited by background refreshing. We propose a generative approach based on a Gaussian model to generate the background, so that we can update the background at any time. We also add a mechanism that detects and removes the shadows of moving objects during moving object segmentation. In real-world monitoring, removing the disturbance of burst noise is a hard problem. Taking advantage of the characteristic that most burst noise is sudden and short-term, we put forward a burst noise eliminating algorithm that uses several consecutive frames to wipe off such sudden noise. In particular, we studied the characteristics of the H, S and V components of the reflection light band, and assert that removing the reflection light band amounts to eliminating the negative effect of high-energy reflection light. We use a Gaussian model and the Sobel operator to remove the reflection light band; we also utilize the Canny algorithm to deal with edge corrosion. Finally, we achieved an integrated optimized solution for the traffic monitoring system by making a tradeoff between time and effect. DOI: http://dx.doi.org/10.5755/j01.eee.18.8.2634


I. INTRODUCTION
The rapid development of image processing facilitates the application of video-based monitoring systems in different domains. The video-based traffic monitoring system, a typical and widely used kind of monitoring system, achieves automatic surveillance from the video it captures [1]. However, many video-based traffic monitoring systems, despite their convenience and low cost, do not perform as well as expected.
Most video-based traffic monitoring systems involve image processing, including image graying, binary conversion, and denoising [2]. Image processing is an essential part of the monitoring system, and optimizing it can improve the monitoring effect. We hereby focus on optimizing the video-based traffic monitoring system, mainly its image processing, and discuss optimizations for the different phases of the processing routine. First, we present an adaptive thresholds approach for binary conversion. Next, we introduce a three-dimension Gaussian filter to remove noise in the image. Background subtraction, one of the most popular approaches, is effective but usually limited by real-time background detection. We take advantage of a Gaussian model to generate the background, with which we can refresh the background dynamically at any time. We also put forward a burst noise eliminating algorithm to remove the interference of burst noise. Finally, we use a novel approach to erase the reflection light band, which tends to disturb monitoring badly.

II. BINARY CONVERSION WITH ADAPTIVE THRESHOLDS
As the color information of an image is of little use here, images are usually converted to binary ones. Binary conversion is so important that it sometimes largely determines the final result. In a binary image, each pixel is typically either black or white. White is often used to represent foreground objects and black to represent background ones.
The simplest method of binary conversion is thresholding [3]. Each pixel is set to 1 if its gray value is not less than some threshold, and to 0 otherwise, as
$$b(x, y) = \begin{cases} 1, & g(x, y) \ge T \\ 0, & g(x, y) < T \end{cases} \qquad (1)$$
where g(x, y) is the gray value of pixel (x, y), T is the threshold, and b(x, y) is the resulting binary value. The threshold is therefore the key to image segmentation. An ideal threshold is one that retains as much feature information as possible [4]. An adaptive threshold, which takes the position information of the pixel into consideration, is a better solution than a single predefined threshold. We carry out image binary conversion with an adaptive thresholds method, as follows [5].
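The fixed-threshold rule in (1) can be illustrated with a minimal sketch. The function name `binarize` and the sample values are hypothetical, and the image is assumed to be a list of rows of gray values.

```python
def binarize(gray, threshold):
    """Apply the rule in (1): 1 where the gray value is not less than the threshold, else 0."""
    return [[1 if g >= threshold else 0 for g in row] for row in gray]


# A tiny 2x3 gray image (hypothetical values in 0..255).
image = [
    [12, 200, 90],
    [130, 45, 255],
]
print(binarize(image, threshold=128))  # [[0, 1, 0], [1, 0, 1]]
```

A single threshold treats every region of the image alike, which is the limitation the adaptive method addresses.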
Step 1: Segment the image into a foreground part and a background part with (2). Set each pixel whose gray value is less than threshold 1 to 0; otherwise set it to 1. In (2), h(L_i) represents the number of occurrences of the gray value L_i, and P(L_i) is its occurrence probability.
Step 2: Get threshold 2 by (3). Set each pixel whose gray value is less than threshold 2 to 0; otherwise set it to 1. In (3), f(x, y) is the gray value of pixel (x, y) and θ is a small unsigned integer.
Step 3: Get an adaptive window A of pixel (x, y) and a local adaptive threshold 3. Set each pixel whose gray value is less than threshold 3 to 0; otherwise set it to 1. Here M is the total number of pixels in A.
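The local step can be sketched as follows. This is only an illustration under stated assumptions: the local threshold is taken to be the mean gray value over a square window A (the paper's exact formula for threshold 3 is not reproduced here), and the name `local_mean_binarize` and the parameter `radius` are hypothetical.

```python
def local_mean_binarize(gray, radius=1):
    """Binarize each pixel against the mean gray value of its window A (assumed rule)."""
    height, width = len(gray), len(gray[0])
    binary = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Window A: the (2*radius+1)^2 neighborhood, clipped at the image border.
            window = [
                gray[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # len(window) plays the role of M, the number of pixels in A.
            threshold = sum(window) / len(window)
            binary[y][x] = 1 if gray[y][x] >= threshold else 0
    return binary


image = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 10, 10],
]
print(local_mean_binarize(image))  # [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
```

Because the threshold follows the neighborhood, a bright object stays separable even when overall illumination differs across the frame.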
The following two figures are converted by thresholding. The left one is processed by a single predefined threshold method and the right one by our adaptive thresholds method. The segmented objects in the right image are distinct, but they are fuzzy in the left one, especially the vehicle in the center of the image.
Fig. 1. The left image (a) is processed by the conventional single threshold method and the right image (b) by the adaptive thresholds method. The right one is more distinct than the left one.

III. DENOISING WITH THREE-DIMENSION GAUSSIAN FILTERING
Binary converted images usually contain a lot of noise. Hence we need image denoising to reduce the negative effect caused by noise. Commonly used algorithms include neighborhood averaging, median filtering and Gaussian filtering.

A. Neighborhood Averaging Filtering
Neighborhood averaging filtering (NAF) [6] assigns the average gray value of a pixel and its neighborhood pixels to the pixel, as
$$g(x, y) = \frac{1}{M} \sum_{(i, j) \in S} f(i, j) \qquad (5)$$
where f(i, j) is the gray value of pixel (i, j), g(x, y) is the filtered value, S is the coordinate set of the neighborhood pixels, and M is the total number of pixels in S. The NAF approach works well in most cases but tends to blur the image.
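A minimal sketch of neighborhood averaging, assuming S is a square window clipped at the image border; the function name and the `radius` parameter are hypothetical.

```python
def neighborhood_average(gray, radius=1):
    """Replace each pixel with the mean gray value of its neighborhood S."""
    height, width = len(gray), len(gray[0])
    filtered = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # S: pixel (x, y) and its neighbors, clipped at the image border.
            neighbors = [
                gray[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            filtered[y][x] = sum(neighbors) / len(neighbors)  # M = len(neighbors)
    return filtered


noisy = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
smoothed = neighborhood_average(noisy)
print(smoothed[1][1])  # 1.0: the isolated spike is spread over its 9-pixel window
```

The spike is attenuated but also smeared into its neighbors, which is the blurring effect noted above.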

B. Median Filtering
With median filtering (MF) [7], the gray value of each original pixel is replaced with the median value of a predefined window W. (6) shows how to get the median value. The MF approach can keep image details and protect edges from blurring. However, the image may still become fuzzy.
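A minimal sketch of median filtering, assuming W is a square window clipped at the image border; the function name and `radius` are hypothetical.

```python
from statistics import median


def median_filter(gray, radius=1):
    """Replace each pixel with the median gray value of its window W."""
    height, width = len(gray), len(gray[0])
    filtered = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                gray[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # For an even-sized (border) window, statistics.median averages
            # the two middle values.
            filtered[y][x] = median(window)
    return filtered


noisy = [
    [1, 1, 1],
    [1, 100, 1],
    [1, 1, 1],
]
print(median_filter(noisy)[1][1])  # 1: the isolated spike is removed entirely
```

Unlike averaging, the median discards an isolated outlier instead of spreading it, which is why edges are better preserved.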

C. Gaussian Filtering
The Gaussian filter [8] is a linear smoothing filter whose weights follow the shape of the Gaussian function. It is especially effective at removing noise with a normal (Gaussian) distribution [8]. The one-dimension Gaussian function is
$$G(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-x^2 / (2\sigma^2)} \qquad (7)$$
where σ determines the width of the Gaussian function.
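The one-dimension Gaussian function can be turned into a discrete smoothing kernel. The sketch below assumes the kernel is truncated at ±3σ and renormalized so its weights sum to 1 (the renormalization replaces the analytic normalizing constant); border pixels are replicated. All function names are hypothetical.

```python
import math


def gaussian_kernel_1d(sigma):
    """Sample the Gaussian at integer offsets in [-3*sigma, 3*sigma] and normalize."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_row(row, kernel):
    """Convolve one image row with the kernel, replicating the border pixels."""
    radius = len(kernel) // 2
    last = len(row) - 1
    smoothed = []
    for x in range(len(row)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            i = min(max(x + k - radius, 0), last)  # clamp index to the row
            acc += weight * row[i]
        smoothed.append(acc)
    return smoothed


kernel = gaussian_kernel_1d(1.0)
print(len(kernel))  # 7 taps for sigma = 1
print(smooth_row([0, 0, 0, 90, 0, 0, 0], kernel))
```

A larger σ widens the kernel and smooths more strongly, at the cost of more multiplications per pixel.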

D. Three-dimension Gaussian Filtering
In our experiment [9], we use a three-dimension Gaussian filter (3DGF), where W is the width of the Gaussian function in its kernel function. The data context window [-3σ, 3σ] covers most of the filter coefficients. We tested filtering using the three approaches with different parameters. Table I shows the processing time, and Fig. 2 shows the filtering effects. GF had the least processing time and 3DGF took second place. As for the effect, 3DGF was the best: it identified the cars as well as the lines of the traffic lanes in the center of the images, while two cars and some parts of the traffic lanes were indistinct in the image produced by GF. Overall, 3DGF had the best cost efficiency.

IV. DYNAMIC BACKGROUND REFRESHING
Background subtraction is a practical approach to detecting moving objects in video monitoring [10]. However, in practice, environmental factors disturb monitoring. For example, variations in sunlight illumination and different weather conditions change the background considerably. As a result, conventional background subtraction without periodic and dynamic background refreshing usually does not work well in the real world. We here introduce a dynamic background refreshing method to alleviate the impact of the environment.

A. Background Detection
We find that the background of monitoring video, taken as a whole, tends to change gradually within a certain time period rather than suddenly and sharply. The changed pixels generally follow a Gaussian distribution with mean μ and standard deviation σ. We hereby utilize the Gaussian function to model the different states of a pixel throughout all time periods. Consequently, given time t, the states of pixel (x, y) form a time sequence $\{I(x, y, k);\ 1 \le k \le t\}$, where I is the image sequence and (x, y, k) represents pixel (x, y) of frame k, and each pixel is modeled by a Gaussian distribution. We use the model to fit the pixel in each newly arriving frame. Given a pixel, if the values of its H (hue), S (saturation), and V (brightness) components [11] all fall within the range defined by μ and σ, it is set as a background pixel; otherwise it is a foreground one.
However, in the HSV model [11], if the value of the S component is less than some threshold, the image is regarded as monochromatic. In that case the H component is of little use, and the color precision reflected by the S component drops as the value of the V component decreases. Hence, the H component is only involved in some specific cases, the S component is used to examine whether the image is chromatic or monochromatic, and the V component is essential throughout the processing.
Let $H_1$, $S_1$, $V_1$ be the H, S, and V component values of a pixel, and let $H_2$, $S_2$, $V_2$ be those of a newly arriving pixel; let $S_T$ be a saturation threshold and σ the standard deviation. We use the following criteria to examine whether the new pixel is a background or a foreground one. There are four cases altogether.

Case 1: if both $S_1$ and $S_2$ are less than $S_T$, we check whether the variation of the V component is not more than σ. If true, the pixel is regarded as a background one.
Condition: $S_1 < S_T$ AND $S_2 < S_T$; Criterion: $|V_2 - V_1| \le \sigma$.

Case 2: if $S_1$ is less than $S_T$ and $S_2$ is greater than $S_T$, we check whether both the variation of the V component and the variation of the S component are not more than σ. If true, the pixel is regarded as a background one.
Condition: $S_1 < S_T$ AND $S_2 > S_T$; Criterion: $|V_2 - V_1| \le \sigma$ AND $|S_2 - S_1| \le \sigma$.

Case 3: if $S_1$ is greater than $S_T$ and $S_2$ is less than $S_T$, we check whether both the variation of the V component and the variation of the S component are not more than σ. If true, the pixel is regarded as a background one.
Condition: $S_1 > S_T$ AND $S_2 < S_T$; Criterion: $|V_2 - V_1| \le \sigma$ AND $|S_2 - S_1| \le \sigma$.

After background detection, the unmatched pixels are kept unchanged and the matched ones are updated by (9), where $\sigma_{min}$ is a noise threshold, α represents the model learning rate, and x denotes the H, S and V components.
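The saturation-gated matching can be sketched as below. This is an illustration under stated assumptions, not the paper's exact procedure: Case 3 is read as the mirror of Case 2 ($S_1$ above $S_T$, $S_2$ below), a saturation equal to $S_T$ is treated as "not less", and for the fourth case (both saturations above $S_T$) only the V and S checks are applied because the hue test is not modeled here. The function name is hypothetical.

```python
def matches_background(model_hsv, new_hsv, sat_threshold, sigma):
    """Return True if the new pixel matches the stored background pixel."""
    _, s1, v1 = model_hsv
    _, s2, v2 = new_hsv
    v_ok = abs(v2 - v1) <= sigma
    s_ok = abs(s2 - s1) <= sigma
    low1 = s1 < sat_threshold
    low2 = s2 < sat_threshold
    if low1 and low2:
        # Case 1: both pixels are nearly monochromatic, so only brightness is compared.
        return v_ok
    if low1 != low2:
        # Cases 2 and 3: exactly one saturation is below the threshold.
        return v_ok and s_ok
    # Both saturations are at or above the threshold: V and S checks only.
    # ASSUMPTION: the additional hue test for this case is omitted in this sketch.
    return v_ok and s_ok


# Hypothetical values with S and V scaled to [0, 1].
print(matches_background((0, 0.10, 0.50), (0, 0.10, 0.55), sat_threshold=0.2, sigma=0.1))  # True
print(matches_background((0, 0.10, 0.50), (0, 0.50, 0.50), sat_threshold=0.2, sigma=0.1))  # False
```

Gating on saturation avoids comparing hue values that are meaningless for nearly gray pixels.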
As the initial distribution of pixels is unknown, we set the component values of the pixels in the first frame as the mean of the Gaussian distribution and set the variance to zero. After that we save the values of the H, S and V components, as well as the corresponding means and variances. Then we compare the new values with the saved ones. Finally we update the pixels that match the saved values and leave the others alone.
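The update of a matched pixel can be sketched with a running Gaussian estimate. The exact form of (9) is not reproduced here, so this sketch assumes the common exponential update with learning rate α and uses $\sigma_{min}$ as a floor on the standard deviation; the function name is hypothetical, and the same update would be applied to each of the H, S and V components.

```python
def update_gaussian(mean, variance, value, alpha, sigma_min):
    """Blend a matched observation into the per-pixel Gaussian (assumed update rule)."""
    new_mean = (1.0 - alpha) * mean + alpha * value
    new_variance = (1.0 - alpha) * variance + alpha * (value - new_mean) ** 2
    # ASSUMPTION: sigma_min keeps the model from becoming narrower than the noise level.
    return new_mean, max(new_variance, sigma_min ** 2)


# Initialization as described in the text: first-frame value as mean, zero variance.
mean, variance = 100.0, 0.0
mean, variance = update_gaussian(mean, variance, value=110.0, alpha=0.5, sigma_min=1.0)
print(mean, variance)  # 105.0 12.5
```

A small α makes the background adapt slowly and resist transient objects; a large α tracks illumination changes faster but absorbs slow-moving objects sooner.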

B. Shadow Detection
The shadow of a moving object causes the object separation component to detect the object together with its shadow as the moving object. If subsequent processing is based on the detected moving object, this leads to mistakes [12]. Hence, we need to detect and remove the shadows of moving objects. We find that, compared with the object part, the shadow part is darker and has less color. Based on this observation, we put forward an approach to detect and remove the shadows of moving objects. We compare the value of the background pixel with that of the candidate shadow pixel. If the variation of the H component value and the variation of the S component value are both less than given thresholds, the pixel is determined to be a shadow pixel. The criterion is evaluated for each pixel (x, y).
The left two figures are snapshots from the monitoring system on an urban road and on a highway. The right figures are the corresponding grayed images after separating the moving objects and removing their shadows. The moving objects are effectively identified and separated, and the shadows are removed.
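A per-pixel shadow test of this kind can be sketched as follows. The formulation is an assumption based on a common HSV shadow criterion rather than the paper's exact formula: a pixel is a shadow candidate when its brightness ratio to the background lies between γ and β (both below 1) and its S and H components differ from the background by less than $T_s$ and $T_h$. Hue is assumed to be in degrees; the function name is hypothetical.

```python
def is_shadow(pixel_hsv, background_hsv, gamma, beta, t_s, t_h):
    """Return True if the pixel looks like a darker copy of the background (assumed test)."""
    h_p, s_p, v_p = pixel_hsv
    h_b, s_b, v_b = background_hsv
    if v_b <= 0:
        return False  # no brightness reference to compare against
    ratio = v_p / v_b
    hue_diff = abs(h_p - h_b)
    hue_diff = min(hue_diff, 360.0 - hue_diff)  # hue is circular
    return (
        gamma <= ratio <= beta
        and abs(s_p - s_b) < t_s
        and hue_diff < t_h
    )


background = (32.0, 0.35, 0.80)
print(is_shadow((30.0, 0.30, 0.40), background, gamma=0.3, beta=0.9, t_s=0.1, t_h=10.0))  # True
print(is_shadow((200.0, 0.30, 0.40), background, gamma=0.3, beta=0.9, t_s=0.1, t_h=10.0))  # False
```

The lower bound γ rejects very dark objects, while the upper bound β rejects pixels that are barely darker than the background and are more likely noise.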

V. BURST NOISE ELIMINATING
Burst noise, such as a flying bird or falling leaves, is an unexpected and sudden interference from outside that usually exists for a very short time span [10]. Since most burst noise exists only in several consecutive frames, we take advantage of the preceding and following frames to remove it. First, we get the subtraction of the current frame $F_n$ and the previous frame $F_{n-1}$, namely $Dif_t$. Then we get the subtraction of the current frame and the current background, namely $Dif_b$. $Dif_b$ includes the current moving objects and noise. As the time span between two consecutive frames is very short, the burst noise generally exists in both frames, so their subtraction removes it. Thus, $Dif_t$ contains only relative motion information. After that, we obtain the moving objects from $Dif_t$ and $Dif_b$. The algorithm that segments moving objects from an image while removing burst noise is as follows.
Step 1: Get the frame subtraction $Dif_t$.
Step 3: Given pixel (x, y), if $Dif_b(x, y)$ and $Dif_t(x, y)$ are both 1, it is regarded as a moving pixel and set to 1; otherwise it is set to 0.
Step 4: Repeat steps 1 to 3 for each pixel in the frame.
Step 5: Get the total movement value by summing up the results returned from step 4.
Step 6: If the total movement is greater than the predetermined threshold, it is treated as a moving object; otherwise it is a non-moving one.
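The steps above can be sketched on gray frames as follows. The binarization of each difference with `diff_threshold` is an assumption (the text only states that the difference images take values 0 and 1), and all function and parameter names are hypothetical.

```python
def binary_difference(frame_a, frame_b, diff_threshold):
    """Return a 0/1 map marking pixels whose gray values differ noticeably."""
    return [
        [1 if abs(a - b) >= diff_threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def detect_motion(current, previous, background, diff_threshold, movement_threshold):
    """Combine frame and background differences to suppress short-lived noise."""
    dif_t = binary_difference(current, previous, diff_threshold)    # frame subtraction
    dif_b = binary_difference(current, background, diff_threshold)  # background subtraction
    # A pixel is moving only if both difference maps agree.
    moving = [
        [1 if t == 1 and b == 1 else 0 for t, b in zip(row_t, row_b)]
        for row_t, row_b in zip(dif_t, dif_b)
    ]
    total_movement = sum(sum(row) for row in moving)
    return moving, total_movement > movement_threshold


background = [[10, 10, 10, 10], [10, 10, 10, 10]]
previous = [[10, 10, 10, 10], [10, 10, 10, 200]]   # a noise blob at the lower right
current = [[10, 200, 200, 10], [10, 10, 10, 200]]  # the blob persists, an object appears
moving, has_object = detect_motion(current, previous, background, 50, 1)
print(moving)      # [[0, 1, 1, 0], [0, 0, 0, 0]]
print(has_object)  # True
```

The blob present in both consecutive frames cancels in the frame difference, so only the newly appeared object survives the AND of the two maps.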

VI. REFLECTION LIGHT BAND ERASING
Many objects, such as the ground and water, can reflect light. These reflections can be ignored, as they perturb monitoring only slightly. However, a light band, which is caused by highly reflective objects such as glass, is a kind of clustered reflection light with high energy. It disturbs monitoring considerably [2], [5], [9], as it is often incorrectly regarded as a moving object. Here we introduce an approach to remove the reflection of highly reflective objects.
We have found that a light band has an obvious edge with symmetrical brightness, which is the part with the highest energy in the reflected light. After binary conversion, the variation of light around the light band becomes more distinct. The distinguishing change of brightness can be seen in Fig. 5. It is caused neither by an environmental change of light, nor by shade, nor by burst noise. Few existing background updating and shadow eliminating algorithms can deal with the reflection light band [1], [2], [10]. In our approach, we regard removing the reflection light band as eliminating the high-energy part of the image. We have found that the low beam of a car headlamp is asymmetric about the reference axis while the full beam is symmetric. In fact, the brightness of the reflected light has a gradient distribution. In particular, the gradient distribution of the part that is not affected by the reflection is uniform and similar to that of the background.
Detecting the edge of the light band is an important and fundamental step in eliminating it. There are many operators for edge detection, such as the Canny operator [12], the Sobel operator [13], and the Prewitt operator [13]. As the Canny edge detector uses a multi-stage algorithm, it usually returns good results but is more time consuming than some other operators [14]. Both the Sobel operator and the Prewitt operator are discrete differentiation operators. The Prewitt operator uses averaging filtering [13] to compute an approximation of the gradient of the image intensity function, while the Sobel operator uses weighted filtering [13] to get an approximation of the opposite of the gradient of the image intensity function.
Taking into account brightness variation and the real-time processing requirement, we use an improved four-direction Sobel operator [15] that adds weight to the pixels near the center and the pixels near the edge highlight to obtain the gradient approximation of the brightness function, thus improving accuracy and speeding up operation. The improved four-direction Sobel operators for the convolution operation are given in Table II. We apply these Sobel operators to get each gradient component of the V components of the background image and the pending image, as (12), where $g_x$, $g_y$, $g_l$, and $g_r$ are the gradient components for the x-direction, y-direction, left diagonal direction and right diagonal direction. We use $g_x$, $g_y$, $g_l$, and $g_r$ to get the x and y direction gradient and the diagonal direction gradient, as (13). As the brightness gradient distribution of light caused by diffuse reflection differs from that caused by full reflection, we can then test whether a given region is affected by full reflection light, as (14), where $G_t$ denotes the testing image and $G_b$ represents the background. When full reflection light arrives, we examine the variation between the V component gradient of the region and that of the background. As the energy of full reflection light declines rapidly with the expansion of the irradiation range, while diffuse reflection light declines more slowly, and given the directional characteristics of full reflection light, it is more effective to separate background and foreground by using the gradient difference of the two directions than by using the gradient of a single direction.
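Computing four directional gradient components of the V channel can be sketched as below. The kernels are the standard 3×3 Sobel kernels and two commonly used diagonal variants, not the improved weights of Table II, and the assignment of the names `g_l` and `g_r` to the two diagonals is an assumption. All identifiers are hypothetical.

```python
# ASSUMPTION: standard Sobel kernels plus common diagonal variants,
# standing in for the improved operators of Table II.
KERNELS = {
    "g_x": [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],
    "g_y": [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],
    "g_l": [[-2, -1, 0], [-1, 0, 1], [0, 1, 2]],
    "g_r": [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]],
}


def directional_gradients(v_channel, x, y):
    """Return the four directional responses at interior pixel (x, y)."""
    responses = {}
    for name, kernel in KERNELS.items():
        responses[name] = sum(
            kernel[j][i] * v_channel[y - 1 + j][x - 1 + i]
            for j in range(3)
            for i in range(3)
        )
    return responses


# A vertical brightness edge: dark on the left, bright on the right.
v = [
    [0, 0, 10],
    [0, 0, 10],
    [0, 0, 10],
]
print(directional_gradients(v, 1, 1))  # {'g_x': 40, 'g_y': 0, 'g_l': 30, 'g_r': 30}
```

Comparing these components between the testing image and the background, direction by direction, is what allows a directional light band to be told apart from a uniformly lit region.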
As we know, projecting the target light onto the background causes a large variation of the background saturation. We conclude that the saturation of the testing image, $S_t$, is greater than that of the background one, $S_b$. Adding white light reduces color saturation. When using background subtraction to extract moving objects, we take advantage of the above judgment rule. If the following conditions are satisfied, we consider the image to be affected by full reflection light rather than by a moving object.
Fig. 6 shows the effect of reflection light band erasing. The left image of Fig. 6 is the binary converted image and the right one is the corresponding generated grayed image. Most of the disturbing reflection light band is eliminated.
Fig. 6. The image after reflection light band erasing.
However, Fig. 6 also shows that although most of the reflection light band is removed, the edge pixels near the light band are corroded as well. To solve this problem, we use the Canny algorithm [12] to detect the edge of the reflection light band. The left figure of Fig. 7 is the edge detected by Canny before reflection light band erasing; the right figure of Fig. 7 is the edge detected by Canny after reflection light band erasing. Fig. 7 shows that the Canny algorithm can deal with edge corrosion.

VII. CONCLUSIONS
A video-based traffic monitoring system must be capable of working in various weather and illumination conditions. However, it has many prerequisites and constraints. In this paper, we focus on optimizing image processing in video-based traffic monitoring.
We first improve the threshold-based image binary conversion approach. Unlike the conventional approach, our adaptive threshold also takes spatial information into account and proves to be better. Then we make a tradeoff between time cost and effect, introducing a three-dimension Gaussian filtering denoising approach. We use a Gaussian model to generate the background, so that we can refresh the background at any time, thus removing a major limitation of the background subtraction approach. At the same time, we introduce a method to detect and remove the shadows of moving objects so that the separated moving objects are not disturbed by their shadows.
As burst noise often interferes with monitoring, we take advantage of the characteristic that most burst noise exists only in several consecutive frames and put forward an algorithm that removes burst noise using several preceding and following frames. To eliminate the reflection light band, we utilize the characteristics of its H, S and V components and conclude that removing the reflection light band amounts to eliminating the high-energy part of the image. Based on this rule, we propose an approach to remove the reflection light band.
The experiments show that the optimization approaches and algorithms improve the effectiveness and efficiency of the system.

If the shadow criterion holds for pixel (x, y), we set the S component value of (x, y) to 1; otherwise we set it to 0. Here $T_s$ and $T_h$ denote the thresholds of the S component and the H component respectively. As the V component value of a pixel in a shadowed area is generally less than that of a pixel in an unshadowed area, γ and β are less than 1. Meanwhile, β measures the strength of brightness: the greater β is, the brighter the light on the pixel. In this way, we can remove the shadows of moving objects. The following figures show the effect of separating moving objects with shadow removal.
Fig. 3. The left figure (a) is a snapshot from the monitoring system on an urban road. The right figure (b) is the grayed image after separating moving objects and removing their shadows.
Fig. 4. The left figure (a) is a snapshot from the monitoring system on a highway. The right figure (b) is the grayed image after separating moving objects and removing their shadows.

Step 2: Get the subtraction of the current frame and the background, $Dif_b$.

Fig. 5. The image with reflection light band.

Fig. 7. The left figure is the edge detected by the Canny algorithm before reflection light band erasing; the right figure is the edge detected by the Canny algorithm after reflection light band erasing.

TABLE I. PROCESSING TIME OF THREE APPROACHES WITH DIFFERENT PARAMETERS. SIZE OF TESTING IMAGE: 704×576. CPU: INTEL CORE I5-2300.

Case 4: if both $S_1$ and $S_2$ are greater than $S_T$, we check whether the variation of the V component, the variation of the S component, and the value of $S_2$ multiplied by the cosine of the variation of the H component are not more than σ. If true, the pixel is regarded as a background one.

TABLE II. IMPROVED FOUR-DIRECTION SOBEL OPERATOR FOR CONVOLUTION OPERATION.