Classification of Multisensor Images with Different Spatial Resolution

Abstract—The paper is focused on the analysis of classification possibilities of multisensor data with different spatial resolutions using combined classifiers based on the Bayes approach with equal prior probabilities and on the minimum of the Mahalanobis distance. The task set up for the 2014 IEEE GRSS Data Fusion Contest was chosen as an application example. A high resolution RGB image and a lower resolution thermal infrared image from the same urban area were processed to perform classification of each higher resolution pixel. Development of a fast and straightforward procedure was targeted, and combined classifiers exploiting spectral features from each data set separately are proposed for that. It is shown that data fusion can be achieved using the proposed classifiers and that classification quality improves with respect to the cases where only one of the data sets is used. The best classification results were obtained using the combined Bayes-type classifier, which provided an overall classification accuracy of about 95 % when the ground truth pixels from the high resolution RGB image were used both for design and testing.


I. INTRODUCTION
Remote sensing from airplanes and satellites has become a widely used tool for solving tasks in management of natural resources, urban planning, precision agriculture and other areas. Different types of sensors are used for that, including multispectral, hyperspectral, LiDAR and SAR, acquiring different kinds of data. It is a common practice to employ several sensors at once to obtain complementary information from the same area. For example, LiDAR and multispectral data from the same forest area are often collected to obtain information for its inventory. LiDAR data can be processed to obtain a height model of the stand, while spectral data can be used to detect species or assess the health of trees, etc. Quite often in this case, data from two different sources should be used in a combined way to solve a specific task, i.e. data fusion should be performed during processing. Data from optical sensors are in general acquired in the form of three-dimensional images where each pixel is related to its spatial coordinates calculated from simultaneously collected GPS information. Pixel size in this case depends on the distance to the target, the viewing angle and the number of sensing cells in the sensor. LiDAR and SAR data are usually preprocessed to obtain images characterizing the geographical areas under study and are also registered to geographical coordinates. Pixel size in this case can be chosen in the preprocessing procedure, but it is limited by the amount of data collected within the spatial unit.
Manuscript received January 5, 2015; accepted June 26, 2015. This research was performed within the project No. 2013/0031/2DP/2.1.1.1.0/13/APIA/VIAA/010 funded by the European Regional Development Fund.
When data from sensors are obtained with different spatial resolution, their fusion becomes a challenging task. It is also crucial to perform precise registration of the acquired images to geographical coordinates. Otherwise, pixels of these images cannot be properly related to the physical objects observed, and their combined use for analysis of these objects cannot be performed correctly.

II. STATE OF THE ART
One of the major tasks in remote sensing is classification of geographical areas or distinct objects represented by acquired images. There are multiple classification approaches developed for that [1]. If the data representing each class can be interpreted as a sample realisation from a multidimensional universe with Gaussian distribution, the Bayes classification approach is applicable and has shown good results [2]. Therefore it is purposeful to consider the Bayesian approach to classification of multiresolution data obtained from different sensors.
Classification of multisensor data with different spatial resolutions is usually performed either by combining the outputs of separate classifiers, each dealing with data from one sensor, or by designing a single classifier operating with a fused image [3]. The first approach is simpler in general and there are multiple studies following it [4]–[10]. Approaches to the combination of multiple individual classifiers, including Bayesian ones, were considered by Xu et al. [4] with application to handwriting recognition. An averaged Bayes classifier was proposed to combine the results of different classifiers applied to the same data. An enhanced combination of independent Bayesian classifiers was elaborated in [6], based on the original definition in [5]. As a result, an iterative procedure was proposed using steps similar to the expectation-maximization algorithm. Other sophisticated approaches include the use of neural networks [7], Dempster-Shafer theory [8] and fuzzy sets [10]. The authors of [3] and [11] propose forming adequate mathematical models for the description of SAR images on the basis of probability integral transforms of natural random values. However, application of the classification algorithms obtained in this way has not provided excellent results with the chosen training and test sets. Storvik et al. [12] proposed a Bayesian approach for the case when pixels of lower resolution images precisely overlap those of the higher resolution image and pixel dimensions at lower resolutions are integer multiples of the pixel dimensions at the higher (reference) resolution. However, this situation can be achieved only with a single multi-modal sensor.

III. TASK AND GOALS
A more common application case, where the data were acquired by two different sensors working in different spectral ranges, was considered within the 2014 IEEE GRSS Data Fusion Contest (DFC) [13]. High resolution (~0.2 m × 0.2 m pixel) RGB images and lower resolution (~1 m × 1 m pixel) thermal infrared (TI) hyperspectral data in 84 bands with wavelengths from 7.8 μm to 11.5 μm, acquired from an urban area, were presented for this contest. Classification of higher resolution pixels into 7 land cover classes was targeted, and ground truth for all classes was presented to facilitate that. In this paper we propose a classifier exploiting DFC data from both sensors in its classification rule, and we present and analyse the results obtained using different classifier designs. The following goals were set up:
– to design a classifier distinguishing all categories of higher resolution pixels with high precision, i.e. a low error rate;
– to provide a general classification approach not exploiting specific features of the analysed scene, i.e. independent of image specifics;
– to propose a solution requiring low computational resources, i.e. facilitating fast processing of large data sets.
Only the "subset" images of the data set "grss_dfc_2014" presented in the initial stage of the DFC were processed, including all ground truth regions (see Fig. 1). Ground truth regions were presented in a separate image with the same spatial resolution as the RGB image and covered ~17 % of the whole area of the "subset" image. To prepare ground truth for the TI image, only pixels fully included in the defined ground truth regions were used, comprising ~12 % of the whole area of the "subset" TI image.

IV. DESIGN OF SEPARATE CLASSIFIERS
To solve the defined task, a classifier obtained using an appropriate data fusion method should be designed. Our chosen approach presumes two stages: in the first stage, two separate classifiers are designed, each using only one of the available images; in the second stage, these two classifiers are "mated" into a combined one, expecting an increase of precision.

The vector of mean values of class k,

\mu_k = \frac{1}{c_k} \sum_{\lambda=1}^{c_k} x_\lambda,

was calculated first from the ground truth pixels of this class within the RGB image. After that, the covariance matrix of each class was calculated as

S_k = \frac{1}{c_k - 1} \sum_{\lambda=1}^{c_k} (x_\lambda - \mu_k)(x_\lambda - \mu_k)^T, (1)

where c_k is the number of pixels in the design set of the class k and x_\lambda is a column vector of the RGB intensity values of pixel \lambda. Assuming that the intensity distribution of pixels of the class k in the RGB bands is represented by the random vector X_k, (1) actually gives us a point estimate of the covariance matrix of this random vector.
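The per-class estimation above can be sketched as follows; `class_statistics` is an illustrative name, and the input is assumed to be an array with one design-set pixel per row (3 columns for RGB, 8 for the chosen TI bands):

```python
import numpy as np

def class_statistics(pixels: np.ndarray):
    """Point estimates of the class mean vector and covariance matrix.

    pixels: (c_k, d) array, one row per design-set pixel of the class.
    """
    mu = pixels.mean(axis=0)                # mean intensity vector
    centered = pixels - mu
    # Unbiased sample covariance with 1/(c_k - 1) normalization, as in (1)
    cov = centered.T @ centered / (len(pixels) - 1)
    return mu, cov
```

The same routine serves both sensors, since only the dimensionality of the intensity vectors differs.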
To prepare a classifier for the TI image, 8 spectral bands out of 84 were chosen, featuring lower image noise and taken from different parts of the spectral range of the TI sensor. The following octet of TI bands was formed: (4, 14, 26, 36, 47, 57, 69, 78). After that, the same procedure as for the RGB image was applied, i.e. a vector of mean values \nu_k for each class of pixels was obtained and the covariance matrix was calculated as

T_k = \frac{1}{\tilde{c}_k - 1} \sum_{\lambda=1}^{\tilde{c}_k} (y_\lambda - \nu_k)(y_\lambda - \nu_k)^T, (2)

where \tilde{c}_k is the number of pixels in the design set of the class k formed for the TI image and y_\lambda is the column vector of the TI intensity values of pixel \lambda taken from the chosen bands.
With certain credibility we may assume that the intensity distribution of pixels of the class k in the RGB image is represented by a random vector X_k with Gaussian distribution. The authors of [3] and [11] also consider this distribution an adequate model for multidimensional optical images. In this case, the probability density function of this vector can be approximately expressed as

f_k(x) = \frac{1}{(2\pi)^{3/2} |S_k|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu_k)^T S_k^{-1} (x - \mu_k) \right), (3)

where \mu_k and S_k are the mean vector and covariance matrix of class k estimated from the RGB design set. Assuming analogously that the intensity distribution of pixels of the class k in the chosen 8 bands of the TI image is represented by a random vector Y_k with Gaussian distribution, its probability density function can be approximately expressed as

g_k(y) = \frac{1}{(2\pi)^{4} |T_k|^{1/2}} \exp\left( -\frac{1}{2} (y - \nu_k)^T T_k^{-1} (y - \nu_k) \right), (4)

where \nu_k and T_k are the corresponding TI class statistics. If the number of pixels that do not belong to any of the 7 classes is negligibly small, we can define two classifiers which ignore these pixels.
We will denote by W a classifier which classifies each pixel of the RGB image as a pixel of class k if and only if its intensity vector x meets the condition

f_k(x) \geq f_j(x) (5)

for each j, where f_k is the class-conditional probability density function of class k in the RGB bands. In an analogous way we will define a classifier V which classifies each pixel of the TI image as a pixel of class k if and only if its intensity vector y meets the condition

g_k(y) \geq g_j(y) (6)

for each j, with g_k the corresponding density in the chosen TI bands. Applying the logarithmic operation to (5) and (6), we can rewrite the classification rules in the following equivalent form: classifier W classifies an RGB image pixel with intensity vector x as a pixel of class k if and only if

\ln|S_k| + (x - \mu_k)^T S_k^{-1} (x - \mu_k) \leq \ln|S_j| + (x - \mu_j)^T S_j^{-1} (x - \mu_j) (7)

for each j. And, similarly, classifier V classifies a TI image pixel with intensity vector y as a pixel of class k if and only if

\ln|T_k| + (y - \nu_k)^T T_k^{-1} (y - \nu_k) \leq \ln|T_j| + (y - \nu_j)^T T_j^{-1} (y - \nu_j) (8)

for each j.
Apparently, classifiers W and V qualify as Bayes-type classifiers. Our informative basis also allows us to define two other separate classifiers, based on the minimum of the Mahalanobis distance. Let us denote by W' a classifier which classifies an RGB image pixel with intensity vector x as a pixel of class k if and only if

(x - \mu_k)^T S_k^{-1} (x - \mu_k) \leq (x - \mu_j)^T S_j^{-1} (x - \mu_j) (9)

for each j; the left-hand side of (9) is the square of the Mahalanobis distance between x and the class mean \mu_k. Similarly, classifier V' classifies a TI image pixel with intensity vector y as a pixel of class k if and only if

(y - \nu_k)^T T_k^{-1} (y - \nu_k) \leq (y - \nu_j)^T T_j^{-1} (y - \nu_j) (10)

for each j.
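The log-form rule (7) and the Mahalanobis rule (9) can be sketched as discriminant functions to be minimized over the classes; all function names below are illustrative:

```python
import numpy as np

def bayes_discriminant(x, mu, cov):
    """ln|S_k| plus the squared Mahalanobis distance, as in rule (7)."""
    diff = x - mu
    _, logdet = np.linalg.slogdet(cov)          # numerically stable ln|S_k|
    return logdet + diff @ np.linalg.solve(cov, diff)

def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance only, as in rule (9) (classifier W')."""
    diff = x - mu
    return diff @ np.linalg.solve(cov, diff)

def classify(x, means, covs, rule=bayes_discriminant):
    """Assign x to the class minimizing the chosen discriminant."""
    scores = [rule(x, mu, cov) for mu, cov in zip(means, covs)]
    return int(np.argmin(scores))
```

The same two routines implement classifiers V and V' by passing the TI statistics instead of the RGB ones.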

V. DESIGN OF THE COMBINED CLASSIFIER
Combination of the separate classifiers is performed on the basis of a procedure relating each pixel of the RGB image with a pixel of the TI image, namely with the bigger TI pixel that includes the largest part of the smaller RGB image pixel. We will name such a pixel of the TI image the associated pixel. As the boundary pixels of the ground truth polygons may have associated pixels only partly corresponding to the defined ground truth areas, they are eliminated from these polygons using morphological erosion and are not used for the design of the combined classifier.
Let us define the classification rule for the RGB image pixels. We will denote by U a classifier which classifies an RGB image pixel \lambda with intensity vector x as a pixel of class k if and only if

\ln|S_k| + (x - \mu_k)^T S_k^{-1} (x - \mu_k) + \ln|T_k| + (y_a - \nu_k)^T T_k^{-1} (y_a - \nu_k) \leq \ln|S_j| + (x - \mu_j)^T S_j^{-1} (x - \mu_j) + \ln|T_j| + (y_a - \nu_j)^T T_j^{-1} (y_a - \nu_j) (11)

for each j, where y_a is the vector of the TI intensity values of the pixel associated with pixel \lambda, and (\mu_k, S_k) and (\nu_k, T_k) are the RGB and TI class statistics. Classifier U' is obtained by replacing rule (11) with

(x - \mu_k)^T S_k^{-1} (x - \mu_k) + (y_a - \nu_k)^T T_k^{-1} (y_a - \nu_k) \leq (x - \mu_j)^T S_j^{-1} (x - \mu_j) + (y_a - \nu_j)^T T_j^{-1} (y_a - \nu_j) (12)

for each j.
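The combined Bayes-type rule (11) sums the log-discriminants of both sensors for each candidate class; a sketch with illustrative names, where each statistics list holds one (mean, covariance) pair per class:

```python
import numpy as np

def combined_discriminant(x, y_a, mu, S, nu, T):
    """Combined Bayes-type score of one class for an RGB intensity vector x
    and the TI intensity vector y_a of the associated pixel (rule (11))."""
    dx, dy = x - mu, y_a - nu
    return (np.linalg.slogdet(S)[1] + dx @ np.linalg.solve(S, dx)
            + np.linalg.slogdet(T)[1] + dy @ np.linalg.solve(T, dy))

def classify_U(x, y_a, rgb_stats, ti_stats):
    """rgb_stats, ti_stats: per-class lists of (mean, covariance) pairs."""
    scores = [combined_discriminant(x, y_a, mu, S, nu, T)
              for (mu, S), (nu, T) in zip(rgb_stats, ti_stats)]
    return int(np.argmin(scores))
```

Dropping the two log-determinant terms in `combined_discriminant` yields classifier U' of rule (12).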

VI. RESULTS AND CONCLUSIONS
Testing of classifiers W, W', V, V', U, and U' on the basis of the design sets provided the results presented in Table I. To obtain comparable results, the morphological erosion introduced for the design of the combined classifier was applied in the design of all classifiers. The kappa coefficient is based on Cohen's kappa measure and characterizes the classifier's quality in comparison with the ideal classifier. Classification results of individual pixels obtained using the best classifier U are visualized in Fig. 2.
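Cohen's kappa used above can be computed directly from the confusion matrix of a classifier; a minimal sketch:

```python
import numpy as np

def cohens_kappa(confusion: np.ndarray) -> float:
    """Cohen's kappa from a (k x k) confusion matrix
    (rows: ground truth classes, columns: predicted classes)."""
    n = confusion.sum()
    p_observed = np.trace(confusion) / n                       # accuracy
    p_chance = (confusion.sum(0) @ confusion.sum(1)) / n**2    # chance agreement
    return (p_observed - p_chance) / (1 - p_chance)
```

Kappa equals 1 for a perfect classifier and 0 for one whose agreement with the ground truth is no better than chance.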
As is seen from the table, the Bayes-type classifiers provide better overall accuracy than their counterparts based on the Mahalanobis distance minimum principle in all cases. In addition, the combined classifiers provide better quality measures than the individual ones. Therefore we can conclude that the proposed approach to combining separate classifiers is fruitful.
In Fig. 2 one may notice that the obtained pixel classification errors are mainly related to confusion of the vegetation and tree classes, as well as to erroneous classification of roads or concrete roofs as grey roofs. Such errors were somewhat expected due to the spectral similarity of these classes, and more sophisticated classification approaches should be used to distinguish between them. Analysis of the classification results leads to the conclusion that the assumption about the Gaussian distribution of the intensity vectors of pixels within a land cover category is a sufficiently adequate model of the real processed data.
Notwithstanding the results already achieved, classification accuracy can probably be improved by forming subclasses within categories where the Bayes-type classifier provides relatively worse results. This could be planned as future work. Another possible way to improve classification results could be a more thorough investigation of the informativeness of the spectral bands in the TI image. However, in this work we focused on the development of the data fusion principle for classification, and the mentioned improvement possibilities were left out of its scope.
The authors of paper [12] have obtained good classification results by designing classifiers based on a more sophisticated mathematical model, namely Markov random field theory, describing the distributions of the intensity vectors of pixels. However, it cannot be substantiated that their approach should be followed in our particular application case. Their assumption about the pixel sizes and locations in images from different sensors holds only in a specific case which cannot be related to our task. In addition, the Cohen's kappa measure achieved by these authors for real test data is not higher than 0.9, and it was obtained for a different system of pixel categories. The combined classifier proposed in this paper seems much simpler and more straightforward to implement.

Fig. 1. Illustration of processed multisensor data: (a) part of the RGB image; (b) the same part of the band 10 image acquired by the thermal infrared sensor, with ground truth for 7 classes.

Fig. 2. Classification results of individual ground truth pixels obtained using the combined Bayes-type classifier U.

TABLE I. CLASSIFICATION RESULTS.