Kalman Filter for Hybrid Tracking Technique in Augmented Reality



I. INTRODUCTION
Most research in the augmented reality (AR) field investigates problems related to tracking and interaction techniques. Tracking techniques allow virtual content to be rendered in the appropriate orientation and position as the user's perspective changes, while interaction techniques allow users to manipulate the virtual content. Currently implemented object tracking solutions for AR are often based on computer vision (CV) techniques such as SIFT, FERNS, SURF, FAST or other similar methods and their modifications [1]-[5]. An augmented reality system based on computer vision techniques must address several tasks:
1. Locate and track the object in the scene.
2. Display virtual content depending on the tracked object's orientation and position.
3. Ensure an opportunity to interact with the content.
While tracking an object in real environment conditions, several problems must be solved at the same time: varying illumination, image transformations caused by different camera perspectives, image quality, reflections, and partial or full occlusion. As a result, fewer stable features are detected and matched between different viewpoints of the same scene, which is required for reliable object tracking. If tracking is lost, the virtual content is not displayed. Image processing speed is also a critical aspect for AR and must be accomplished in real time.
Depending on the AR application field, ultrasonic tracking [6] is one solution for position tracking; however, it is limited to a fixed workspace and provides no orientation estimates. Object motion tracking using digital inertial sensors is currently an active research topic, analysed in [7], [8]. Fast and irregular camera movement causes tracking errors and instabilities for computer vision. These problems can be mitigated by using inertial sensors to estimate rapid camera orientation changes. Camera tracking using inertial sensors is a suitable method because of the high measurement acquisition rate; however, stability must be maintained over longer periods of time, as the sensors are affected by noise, drift and magnetic interference. By integrating and combining information from several sensors, the disadvantages of individual sensors can be eliminated using sensor fusion solutions [9]-[11], providing reliable orientation estimates.
This paper describes the position and orientation estimation problems that are critical in the field of augmented reality and proposes a way to improve the estimates. No specific computer vision method is proposed in this research; instead, it is assumed that the object can be recognized and tracked in an image at 20 frames per second (a disadvantage in speed). Wrong or unavailable orientation-position estimates from computer vision tracking can be corrected or supplemented with additional estimates from sensors. Orientation and position estimation is the main aspect analysed in this work. To achieve better orientation-position estimates, a Kalman filter for a hybrid tracking technique was designed. Computer vision (CV) and sensor fusion (SF) tracking information were simulated.

II. OBJECT TRACKING USING SENSOR FUSION
Orientation estimation using digital sensors is an active research topic. A digital accelerometer measures the acceleration and gravity acting on the device. In the general case this sensor is suitable for estimating the orientation of an object relative to the Earth's gravitational force. From the accelerometer measurements $a_x$, $a_y$, $a_z$, the orientation angles $\phi_a$ and $\theta_a$ can be estimated using

$$\phi_a = \operatorname{atan2}(a_y, a_z), \qquad \theta_a = \operatorname{atan2}\bigl(-a_x, \sqrt{a_y^2 + a_z^2}\bigr), \tag{1}$$

while orientation from the gyroscope is estimated by integrating the angular rate,

$$\theta_t = \theta_{t-1} + \omega_t \Delta t, \tag{2}$$

where $\omega_{x,t}$, $\omega_{y,t}$, $\omega_{z,t}$ are the gyroscope angular rates at the current time moment $t$; $\theta_{t-1}$ is the orientation estimated at the earlier time moment $t-1$; $\Delta t$ is the time between measurements.
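The accelerometer tilt estimate (1) and the gyroscope integration (2) can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation; function names and the axis convention (z axis aligned with gravity when the device is level) are assumptions.

```python
import math

def accel_orientation(ax, ay, az):
    """Roll and pitch (radians) from the accelerometer gravity direction, cf. (1)."""
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.sqrt(ay * ay + az * az))
    return roll, pitch

def gyro_integrate(theta_prev, omega, dt):
    """Propagate one orientation angle by a gyroscope sample, cf. (2)."""
    return theta_prev + omega * dt
```

For a device lying flat (only gravity on the z axis), both angles come out as zero; the gyroscope path simply accumulates rate times time step, which is why its drift grows without an external reference.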
Orientation estimated from gyroscope angular rate measurements is accurate only over a short period of time. Over a longer period the drift increases exponentially and errors accumulate without any fixed reference system. For an augmented reality tracking system this is a critical aspect.
The heading $\psi_b$ can be determined from the magnetometer using the tilt-compensated expressions (3)-(5):

$$X_h = m_x \cos\theta + m_y \sin\phi \sin\theta + m_z \cos\phi \sin\theta, \tag{3}$$
$$Y_h = m_y \cos\phi - m_z \sin\phi, \tag{4}$$
$$\psi_b = \operatorname{atan2}(-Y_h, X_h), \tag{5}$$

where $m_x$, $m_y$, $m_z$ are the magnetic field measurements from the magnetometer, and $\phi$, $\theta$ correspond to the orientation $\phi_a$ and $\theta_a$ estimated using the accelerometer (1). One of the magnetometer's disadvantages is that the magnetic field measurements are affected by distortions caused by ferromagnetic objects; these distortions can be compensated.
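A minimal sketch of the tilt-compensated heading computation follows; it assumes one common axis convention (x forward, y right, z down), so the function name and sign choices are illustrative assumptions rather than the paper's exact formulation.

```python
import math

def tilt_compensated_heading(mx, my, mz, roll, pitch):
    """Heading from magnetometer readings rotated into the horizontal plane, cf. (3)-(5)."""
    xh = (mx * math.cos(pitch)
          + my * math.sin(roll) * math.sin(pitch)
          + mz * math.cos(roll) * math.sin(pitch))
    yh = my * math.cos(roll) - mz * math.sin(roll)
    return math.atan2(-yh, xh)
```

With a level device (roll = pitch = 0) the horizontal components reduce to the raw x and y magnetometer readings, so the heading is simply atan2 of those.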
A combination of accelerometer, gyroscope and magnetometer measurements is used to ensure reliable object orientation in 3D space. The orientation drift of the gyroscope is eliminated with the accelerometer, which contributes to correcting the $\phi$ and $\theta$ angles, while the magnetometer, in combination with the accelerometer, ensures corrected heading $\psi$ measurements. This is the main idea of sensor fusion using a quaternion representation. Compared to Euler angles or a rotation matrix, a quaternion requires less calculation time, provides reliable orientation estimation and maintains stability. A quaternion $q$ is a four-element vector

$$q = [q_w, q_x, q_y, q_z] = w + xi + yj + zk, \tag{6}$$

where $x$, $y$ and $z$ correspond to rotations with respect to the x, y and z axes. Quaternion-based algorithms for orientation estimation using sensors are analysed in detail in [8], [9]; therefore, no detailed analysis is provided in this work. Systems that use such orientation estimation techniques are not limited by motion, a specific environment, place or occlusions, and therefore have advantages over computer vision techniques.
Even though the quaternion representation of orientation is more reliable, for simplicity Euler angles are used instead of quaternions in the further hybrid tracking experiments. The Euler angles are obtained from the quaternion using (7):

$$\phi = \operatorname{atan2}\bigl(2(q_0 q_1 + q_2 q_3),\; 1 - 2(q_1^2 + q_2^2)\bigr),$$
$$\theta = \arcsin\bigl(2(q_0 q_2 - q_3 q_1)\bigr), \tag{7}$$
$$\psi = \operatorname{atan2}\bigl(2(q_0 q_3 + q_1 q_2),\; 1 - 2(q_2^2 + q_3^2)\bigr).$$

Position estimation is a difficult but possible task using accelerometer measurement data. As mentioned before, the accelerometer output jointly measures acceleration and gravity; therefore, gravity must be eliminated to obtain linear acceleration. If measurements from the sensor array (accelerometer, gyroscope and magnetometer) are available, the gravity vector can be estimated from the orientation quaternion using (8):

$$g_x = 2(q_1 q_3 - q_0 q_2), \qquad g_y = 2(q_0 q_1 + q_2 q_3), \qquad g_z = q_0^2 - q_1^2 - q_2^2 + q_3^2, \tag{8}$$

where $q_0$, $q_1$, $q_2$, $q_3$ are the quaternion elements that represent orientation, and $g_x$, $g_y$, $g_z$ is the gravity direction with respect to each accelerometer axis. The orientation estimate in quaternion form must be accurate: even small errors in the orientation used to calculate the gravity vector can cause large errors in the linear acceleration. The linear acceleration vector $a_g$, with the gravity $g$ eliminated from the accelerometer output $a$, is estimated using (9):

$$a_g = a - g. \tag{9}$$
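The gravity-elimination step (8)-(9) can be illustrated with a short sketch. The quaternion-to-gravity mapping is the standard one for a unit quaternion; the scaling by 9.81 m/s² and the helper names are assumptions for this illustration.

```python
def gravity_from_quaternion(q0, q1, q2, q3):
    """Gravity direction per axis from a unit orientation quaternion, cf. (8)."""
    gx = 2.0 * (q1 * q3 - q0 * q2)
    gy = 2.0 * (q0 * q1 + q2 * q3)
    gz = q0 * q0 - q1 * q1 - q2 * q2 + q3 * q3
    return gx, gy, gz

def linear_acceleration(a, g, g0=9.81):
    """Subtract the gravity vector from the raw accelerometer output, cf. (9)."""
    return tuple(ai - g0 * gi for ai, gi in zip(a, g))
```

For the identity quaternion the gravity direction is (0, 0, 1), so a stationary, level device with 9.81 m/s² on its z axis yields zero linear acceleration, as expected.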
Using the linear acceleration $a_g$, the velocity $v_t$ and position $p_t$ vectors are estimated for each axis using (10) and (11):

$$v_t = v_{t-1} + a_g \Delta t, \tag{10}$$
$$p_t = p_{t-1} + v_t \Delta t, \tag{11}$$

where $\Delta t$ is the time between measurements. Accelerometer measurements are affected by noise that cannot be efficiently eliminated for position estimation. For the further experiments, the orientation and position-velocity-acceleration vector $x_{sf}$ estimated using the sensors is modelled using (12):

$$x_{sf} = [\phi, \theta, \psi, p, v, a]^T. \tag{12}$$
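The double integration in (10)-(11) is a simple accumulation loop; a sketch for one axis follows. The function name and the rectangular-integration choice are assumptions, and the sketch deliberately shows why unfiltered accelerometer noise makes the position estimate drift: any constant bias in `a_samples` grows quadratically in the position.

```python
def integrate_motion(a_samples, dt, v0=0.0, p0=0.0):
    """Integrate 1-axis linear acceleration to velocity and position, cf. (10)-(11)."""
    v, p = v0, p0
    for a in a_samples:
        v += a * dt   # (10)
        p += v * dt   # (11)
    return v, p
```

For example, ten samples of 1 m/s² at dt = 0.1 s yield v = 1.0 m/s and p = 0.55 m.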

III. OBJECT TRACKING USING COMPUTER VISION
Computer vision methods use feature extraction (detection), description and matching of interest points in an image as the primary analysis steps for object tracking. This allows the relative orientation and position of the camera to be estimated using the RANSAC robust estimation method. From the geometric perspective it can be explained using the pinhole camera model, which is widely used and analysed in computer vision research.
IV. KALMAN FILTER FOR HYBRID TRACKING
Two main aspects are important from the user's perspective when using an AR system:
1. The system must operate in real time without any delay.
2. Even under rapid movement and occlusion, tracking must be robust without any jittering.
While representing virtual content in the augmented reality environment, these aspects must be ensured without interruption using hybrid tracking. Dynamic motion measurements from sensors are used to improve the information provided by the computer vision method. A conceptual diagram of data acquisition and processing from the sensors and a general purpose camera is presented in Fig. 1; this is the general model of the hybrid tracking technique. The following assumptions are made for the hybrid tracking technique:
1. Calibration of the camera and sensors is accomplished offline.
...
6. The sensors attached to the camera move in the environment, not the object that is tracked in the scene using CV.
7. The CV method provides relative information about the camera orientation-position; accordingly, CV and SF orientation-position information is simulated from a common starting point.
The last assumption is the main aspect for orientation-position estimation, because the information does not coincide when the $x_{sf}$ and $x_{cv}$ vectors are taken separately. For instance, in the case of CV the relative camera orientation-position is estimated from the object that is found and tracked in the scene, while the sensor fusion solution provides this information directly from the camera motion.
After taking the assumptions into consideration, a Kalman filter (KF) is applied to the simulated orientation-position information. The KF algorithm is recursive and widely used in object trajectory prediction, control, tracking, collision-warning systems, image processing, sensor fusion, etc. The KF consists of the prediction (process model) equations (16), (17) and the update (measurement model) equations (18)-(20):

$$\hat{x}_k = A x_{k-1} + w_k, \tag{16}$$
$$\hat{P}_k = A P_{k-1} A^T + Q, \tag{17}$$
$$K_k = \hat{P}_k H^T (H \hat{P}_k H^T + R)^{-1}, \tag{18}$$
$$x_k = \hat{x}_k + K_k (z_k - H \hat{x}_k), \tag{19}$$
$$P_k = (I - K_k H) \hat{P}_k, \tag{20}$$

where $\hat{x}_k$ is the state prediction vector affected by noise $w_k$; $x_k$ is the update vector; $w_k$ and $v_k$ are respectively the independent Gaussian process prediction and measurement update noises; $A$ is the state transition matrix; $\hat{P}_k$ and $P_k$ are respectively the prediction and update state covariance matrices; $Q$ and $R$ are respectively the independent process and measurement noise covariance matrices; $H$ is the measurement matrix; $I$ is the identity matrix. In the update step the difference between the measurement and prediction states is compensated and new estimates are determined. The KF convergence rate depends on the $Q$ and $R$ values; a decreased value of $Q$ or $R$ expresses confidence in the process or measurement step, respectively.
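For a single scalar state, the prediction and update equations (16)-(20) reduce to ordinary arithmetic; the sketch below shows one full KF cycle under that simplification. The default noise values are illustrative assumptions, not the paper's tuned parameters.

```python
def kf_step(x, P, z, A=1.0, H=1.0, Q=1e-4, R=1e-2):
    """One scalar Kalman filter cycle: prediction (16)-(17), update (18)-(20)."""
    # Prediction (process model)
    x_hat = A * x
    P_hat = A * P * A + Q
    # Update (measurement model)
    K = P_hat * H / (H * P_hat * H + R)       # Kalman gain (18)
    x_new = x_hat + K * (z - H * x_hat)       # innovation-weighted correction (19)
    P_new = (1.0 - K * H) * P_hat             # covariance update (20)
    return x_new, P_new
```

Feeding a constant measurement repeatedly drives the state toward it, with the convergence rate governed by the Q/R ratio, matching the confidence interpretation given above.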
The filter is analysed in parts. From here on, some matrices are denoted by the superscript "o" for orientation and the superscript "p" for position or position-velocity-acceleration vectors to avoid confusion. Two different KF configurations are used for orientation and position estimation. CV has a slower data processing rate; therefore, the faster available SF information is used for prediction between CV samples. According to the described orientation and position estimation scenarios, the KF is applied using the prediction and update equations (16)-(20). In the orientation estimation scenario, three orientation states $\phi$, $\theta$, $\psi$ are estimated using the KF for hybrid tracking. The simulated camera orientation ($\psi$ angle) using SF and CV is presented in Fig. 2.
The initial tracked object orientation is 90 degrees. The processed data are acquired at constant time intervals from the camera and sensors. Similar orientation results would be obtained for the $\phi$ and $\theta$ angles. The camera orientation is simulated in rapid motion. Even though the orientation from CV is available most of the time, in some cases tracking is improperly estimated, which causes spikes, or it can be lost (red crosses) because of occlusion.
Fig. 2. Simulated SF and CV orientation with modelled outliers and unavailable orientation segment using CV.
The Kalman filter for hybrid tracking starts after the orientation information is acquired using the CV method; this is used as the initial state $x_0$ (starting point) in the prediction equation. The measurement update uses the sensor fusion orientation vector. In the general case, better confidence is assigned to the available CV information ($Q$ for the process) than to SF ($R$ for the measurement). The posterior update state $x_k$ obtained from the SF measurement $z_k$ is used in the prediction step with the same confidence as CV information if the orientation vector $x_{k-1}$ from CV is unavailable for the state prediction $\hat{x}_k$. In addition, the following condition is introduced to determine the confidence when a wrong CV orientation vector is acquired:

$$|z_k - H \hat{x}_k| > z_{TH},$$

where $z_{TH}$ is a threshold parameter determined as 5 % of the measurement. The innovation is compared against $z_k$, since measurements from SF are always available. If the error in CV is critical, the innovation exceeds the threshold of the $z_k$ measurement; therefore, better confidence is given to the measurement. In real-life conditions, errors in CV produce considerable differences compared with the SF orientation information. If CV information is available without errors, the innovation does not exceed the threshold, as CV provides results similar to the SF orientation. The orientation estimated using the Kalman filter for hybrid tracking is presented in Fig. 3. The spike in the CV orientation is successfully eliminated using the KF; the same applies to the unavailable orientation segment from CV. The innovation results for orientation are presented in Fig. 4.
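The 5 % innovation gate described above can be sketched as a small predicate. This is an illustrative reading of the condition, with the function name and the relative-threshold interpretation being assumptions.

```python
def cv_is_outlier(z, x_hat, H=1.0, th_frac=0.05):
    """True when the innovation exceeds 5 % of the measurement,
    i.e. the CV-driven prediction should be distrusted and
    confidence shifted to the SF measurement."""
    innovation = z - H * x_hat
    return abs(innovation) > th_frac * abs(z)
```

A prediction of 89 degrees against an SF measurement of 90 degrees passes the gate (innovation 1 < 4.5), while a spiked CV prediction of 70 degrees trips it (innovation 20 > 4.5).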
The spikes in the innovation correspond to errors in the CV orientation information and are taken into consideration while estimating orientation using the Kalman filter for hybrid tracking.
In the position estimation scenario only one coordinate is tracked. For position estimation, SF information is used in the process model and CV information in the measurement model. To avoid an increasing difference between the process and measurement estimates, the prediction equation (16) is modified by adding a control term:

$$\hat{x}_k = A x_{k-1} + B u_k,$$

where $B$ is the control matrix and $u_k$ is the control vector built from the $n = 4$ last available CV measurements. By adding the control vector, which consists of the modified innovation $\hat{y}_k^m$, it is ensured that the position estimate from SF does not diverge between CV samples. The modified innovation is calculated from the previously available CV measurements $z_k$ and the current position estimated from the sensors, until the next CV position sample becomes available. The position estimated using the Kalman filter for hybrid tracking is presented in Fig. 7.
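One possible reading of this drift correction is sketched below. The averaging of the last four CV samples, the control gain `b`, and all names are hypothetical choices for illustration; the paper does not specify the exact form of the modified innovation.

```python
def predict_with_control(x_prev, sf_increment, recent_cv, x_sf, b=0.5):
    """Hypothetical sketch: SF-propagated prediction pulled toward recent CV samples.

    recent_cv    : last four available CV position measurements
    x_sf         : current position estimate from the sensors
    sf_increment : position change predicted from SF between CV samples
    """
    y_mod = sum(recent_cv) / len(recent_cv) - x_sf   # modified innovation (assumed form)
    return x_prev + sf_increment + b * y_mod
```

The control term opposes the SF drift: when the sensor-derived position runs ahead of the CV trend, the modified innovation is negative and pulls the prediction back.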
The main demand for the AR hybrid tracking technique is, in the case of lost object tracking using CV, to provide camera orientation-position information from an additional source: inertial sensors. Summarized results of the accomplished experiments are presented in Table I and Table II.
GT denotes ground truth. In the presented orientation-position estimation results, the KF for hybrid tracking shows the best results: the mean and standard deviation are nearest to the ground truth values. For the root mean square error, in orientation estimation the error relative to 90 degrees (the initial state in object tracking) is the smallest, and in position estimation it is near zero, which indicates reliable estimates. Other research related to these tracking approaches does not provide experimental results or ground truth values in numeric form for objective comparison; as a result, all experiments were accomplished with simulated data. Improvements using hybrid tracking are presented in Table III. At slower camera motion speeds, orientation estimation using CV holds reliable estimates compared to fast motion, when object recognition and tracking in the image is difficult or impossible. In the case of CV, "+/-" denotes additional conditions: whether or not the object in the scene is recognized (an unavailable or occluded object in the scene is a possibility). SF orientation accuracy is not affected by motion speed; in the case of SF, "+/-" denotes unreliable position estimation. The Kalman filter for hybrid tracking balances between reliable and unreliable CV and SF estimates. The sign "-" shows that object position and orientation tracking is not possible. It is important to note that CV and SF must be implemented on different threads; this avoids system delay and allows data processing to be accomplished independently. Even though this research focuses on a hybrid tracking technique, the solution can easily be adopted for the development of interaction devices.
plane. Mapping from 3D to 2D is called perspective projection and can be expressed using (13):

$$u = f_x \frac{X}{Z} + c_x, \qquad v = f_y \frac{Y}{Z} + c_y, \tag{13}$$

where $f_x$, $f_y$ are the focal lengths and $c_x$, $c_y$ is the optical centre of the camera. These intrinsic parameters are also used to remove distortions. The extrinsic parameter is a transformation matrix in the camera coordinate system, which consists of a $3 \times 3$ rotation matrix $R$ for orientation and a $3 \times 1$ translation vector $T$ for position. Computer vision techniques focus on estimating this fundamental transformation matrix. The rotation matrix can be converted to Euler angles using (14).
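The pinhole projection (13) is a one-liner per coordinate; the sketch below uses assumed example intrinsics purely for illustration.

```python
def project_point(X, Y, Z, fx, fy, cx, cy):
    """Pinhole perspective projection of a 3D camera-frame point, cf. (13)."""
    u = fx * X / Z + cx
    v = fy * Y / Z + cy
    return u, v
```

A point on the optical axis projects exactly to the optical centre (cx, cy), regardless of its depth Z, which is a quick sanity check for an implementation.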

Fig. 1. Kalman filter for hybrid tracking: orientation and position estimation using computer vision and sensor fusion techniques.

Fig. 3. Hybrid tracking for orientation estimation.

The state transition matrix

$$A^p = \begin{bmatrix} 1 & \Delta t & \Delta t^2/2 \\ 0 & 1 & \Delta t \\ 0 & 0 & 1 \end{bmatrix}, \tag{23}$$

also known as the constant acceleration model (for the process model), is used to calculate the 1-axis vector of position, velocity and acceleration; for the other two axes the same model should be applied as well. In the measurement part the interest is only in position: velocity and acceleration are not observed, therefore $H^p$ selects only the position component, according to equation (26). The simulated noisy acceleration and the estimated position are presented in Fig. 5.
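Applying the constant acceleration transition matrix (23) to a state [p, v, a] expands to three simple kinematic updates; the sketch below writes them out explicitly for one axis (the function name is an assumption).

```python
def ca_predict(p, v, a, dt):
    """Constant-acceleration process model, cf. (23): propagate [p, v, a] by dt."""
    p_new = p + v * dt + 0.5 * a * dt * dt
    v_new = v + a * dt
    return p_new, v_new, a   # acceleration is modelled as constant
```

Starting from rest with a = 2 m/s² over dt = 1 s gives p = 1 m and v = 2 m/s, matching the familiar kinematic equations.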

Fig. 7. Hybrid tracking for position estimation.

VI. CONCLUSIONS
1. Orientation-position estimation approaches based on sensor fusion and computer vision were analysed. A hybrid tracking solution using a Kalman filter was proposed that has supplementary properties and eliminates the disadvantages of the separate tracking techniques.
2. Orientation and position information was simulated with probable errors that might arise in real-life conditions, caused by rapid motion, occlusions and other problems. The hybrid tracking results were compared with computer-vision-only and sensor-fusion-only estimates.
3. Depending on the information available from the sources, different approaches to the orientation and position estimation problems were proposed, incorporating innovation conditions. The proposed solution improves tracking reliability and eliminates delay in the computer vision orientation-position estimates.
4. The adoption of the hybrid tracking technique in augmented reality is the main aspect of this research. Improved orientation-position estimates would ensure reliable representation of virtual content in the real-world context with a realistic continuous view and interaction.

TABLE I.
MEAN (μ), STANDARD DEVIATION (σ) AND ROOT MEAN SQUARE ERROR RESULTS FOR ORIENTATION ESTIMATION.

TABLE III.
SUMMARY OF HYBRID TRACKING IMPROVEMENTS.