Realization of Instrument for Environmental Parameters Measuring

Abstract: As the effects of global warming spread across the planet, environment monitoring activities are gaining importance and are becoming the focus of many projects, in particular those targeting the development of smart cities and smart societies in general. Advances in technology enable the implementation of low-cost environment monitoring systems that can be installed across cities using the existing city infrastructure (for example, attached to lampposts or installed on public transport vehicles) or even carried around by individuals contributing to crowdsourced monitoring solutions. However, even though the costs are significantly lower than just a few years ago, deploying such systems on a very large scale is still prohibitively expensive. This paper presents a possible approach to significantly reducing the number of measurement points, as well as the number of required sensors, while still providing reliable estimates of environment parameters across an area. The approach is based on methods of mathematical statistics. It is shown that the proposed solution works well in areas with stable and slowly changing weather conditions, i.e. when there are no abrupt changes in weather.

Index Terms: Environmental monitoring, gas detectors, remote monitoring, statistics, MATLAB.

I. INTRODUCTION
It is well known that global urbanization contributes significantly to climate change. Nowadays, the urban population constitutes almost half of the world population. It is estimated that a city of one million citizens produces around 25,000 tons of CO2 and about 300,000 tons of waste water every day. The normal CO2 concentration in the atmosphere is considered to be about 540 mg/m³, and it is estimated that it will reach 1600 mg/m³ by the end of this century if no remedial action is taken [1].
The environment monitoring domain is to a large extent driven by national legislation, while compliance with that legislation is monitored using a relatively small number of devices located at a few locations in each city. The topology of the locations where the monitoring devices are deployed, the choice of the environment parameters that are monitored, the sampling rates, etc. are defined according to the relevant legislation as well as international agreements, such as the Air Quality Guidelines issued by the World Health Organization (WHO) [2].
The number of measurement points depends on the area being monitored, the types of air-pollution sources, geographical features (terrain configuration) and the population density. The purpose and the goal of the monitoring also determine the topology of the network of air quality monitoring devices. In the case of a large, flat, densely populated area, measurement points can be distributed uniformly. The density of the measurement points does not have to be larger than one measurement point per 10 km² if the criterion is geometric, or one point per 25,000 citizens if the criterion is based on the population density.
While such a sparse network provides a macro view of the pollution, i.e. the general level of pollution in a region, it cannot provide a fine-grained spatial resolution of environment parameters, which, for various reasons at the micro level (busy junctions, small factories, etc.), can represent a danger to human health.
The measurement devices that provide data accurate to the level required by the environment monitoring legislation are very expensive. Therefore, it is not feasible to deploy such devices in large numbers across a region to achieve a finer spatial resolution. Technological advances in embedded electronics, sensor technology and mobile communications have made it possible to implement relatively cheap devices that can be deployed at fixed locations as well as on vehicles, thus effectively realizing a mobile monitoring network [3]. The accuracy and the reliability of the sensors used in these devices are lower than in the legislation-compliant devices, but they can still provide good indications of air pollution.
Measurement points should be densely deployed in the city centres (where higher concentrations are expected) and progressively less densely towards the suburbs. The rules for setting up an urban environment monitoring network are not universal and are affected by the following elements: the most common building type in the area, the prevalent type of heating used (fossil fuels, natural gas, electricity, etc.), the type of public transport in use, and the existence and location of heavy industry [3].
While the new generation of environment monitoring devices is significantly cheaper than its predecessors, it is still not feasible to cover every corner, street, park or building with a dedicated device. For that reason, in this paper we describe the potential of using statistical analysis of the measured data, based on interpolation curves, to decrease the number of measurement points. This reduction leads to a reduction of the number of sensors and consequently of the overall cost of deploying and running a citywide environment monitoring system.
The goal of this work is the development of a mobile measurement device as an instrument for monitoring air quality parameters. Our system contains a larger number of sensors than similar devices reported in [4]-[6]. A device with similar characteristics is described in [5], but that measurement station is static and offers no remote data reading. Moreover, our device can be installed on vehicles (public buses, police cars, taxis, etc.) and can measure air pollution parameters in the same way as the devices reported in [4] and [6], with an additional feature that allows a pollution map to be generated for the environment monitoring system of a city. In urban areas, air pollution originates predominantly from the exhaust gases of internal combustion engines. Hence, it was assumed that a strong correlation exists between CO and NO2, which are products of burning fossil fuels.
Correlation between two or more variables, in our case between the observed values of the monitored air pollution parameters, has not often been examined [7]. In order to draw such conclusions, the methods of causative correlation and regression have been used [8]. By using this theory and a regression model (interpolation and extrapolation), it has been shown that it is possible to decrease the number of both measurement points and sensors, because by observing the values of one parameter (for example CO) it is possible to determine the values of another (for example NO2).

II. THE LEAST SQUARES METHOD
Numerous mathematical models that aim to describe accurately the relationship between the various parameters affecting air pollution exist in the literature [9]. One of the most common methods for data fitting is the least squares method (LSM) [8], based on the principle of minimizing the sum of squared residuals. For N measured pairs (x_i, y_i), the discrepancy between the measured values and an empirical formula with a total of (k+1) parameters b0, b1,…, bk is used as the measure of fit

$$\varepsilon_i = y_i - f(x_i; b_0, b_1, \ldots, b_k), \quad i = 1, 2, \ldots, N \tag{1}$$

where the empirical formula is

$$y = f(x; b_0, b_1, \ldots, b_k). \tag{2}$$

From the experimental point of view, it is more suitable to take the sum of the squared discrepancies

$$S(\mathbf{b}) = \sum_{i=1}^{N} \varepsilon_i^2 = \sum_{i=1}^{N} \left[ y_i - f(x_i; b_0, b_1, \ldots, b_k) \right]^2 \tag{3}$$

where b is the parameter vector, b = [bi], i = 0, 1,..., k.
According to the least squares method, the best (optimal) values of the parameters b0, b1,…, bk in the selected empirical formula (2) are those for which the sum of the squared discrepancies has its minimum value, i.e. those satisfying

$$\frac{\partial S}{\partial b_i} = 0 \tag{4}$$

where i = 0, 1, 2,…, k.
Generally, the equations (4) are non-linear. If the system has more than one solution, i.e. if multiple local minima of the function S(b0, b1,…, bk) exist, the solution chosen is the one that gives the smallest value of S, the global minimum.
If we assume that the data being analysed have a linear dependency, the function is linear and has the form f(x) = ax + b, and the coefficients a and b can be determined by applying the least squares method:

$$S(a, b) = \sum_{i=1}^{N} (y_i - a x_i - b)^2 \tag{5}$$

$$\frac{\partial S}{\partial a} = -2 \sum_{i=1}^{N} x_i (y_i - a x_i - b) = 0 \tag{6}$$

$$\frac{\partial S}{\partial b} = -2 \sum_{i=1}^{N} (y_i - a x_i - b) = 0 \tag{7}$$

Referring to (6) and (7), the equation system is derived:

$$a \sum_{i=1}^{N} x_i^2 + b \sum_{i=1}^{N} x_i = \sum_{i=1}^{N} x_i y_i \tag{8}$$

$$a \sum_{i=1}^{N} x_i + N b = \sum_{i=1}^{N} y_i \tag{9}$$

After the transformation of (8) and (9), the coefficients a and b can be calculated as follows:

$$a = \frac{N \sum_{i=1}^{N} x_i y_i - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N \sum_{i=1}^{N} x_i^2 - \left( \sum_{i=1}^{N} x_i \right)^2} \tag{10}$$

$$b = \frac{1}{N} \left( \sum_{i=1}^{N} y_i - a \sum_{i=1}^{N} x_i \right) = \bar{y} - a \bar{x} \tag{11}$$

Using the previous expressions we can define the variances σx² and σy², which represent the mean squared discrepancy from the mean values:

$$\sigma_x^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2 \tag{12}$$

$$\sigma_y^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i - \bar{y})^2 \tag{13}$$

The covariance cov(x, y) represents a measure of the strength of the correlation between the variables:

$$\mathrm{cov}(x, y) = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}) \tag{14}$$

If the covariance is 0, there is no linear relation between the variables. If cov(x, y) > 0, then y tends to change in the same direction as x, and there is a direct linear correlation. If cov(x, y) < 0, then y tends to decrease when x rises, and vice versa. In that case there is a negative or inverse linear correlation [8]. Covariance does not take into account the different degrees of variability of the separate variables, nor the measurement units. To allow comparison, the covariance needs to be scaled so that it gives the same numerical value for the same degree of correlation between two variables, regardless of the order of magnitude of the variables and regardless of the measurement units. A common approach is to divide the covariance by the product of the standard deviations of the variables. The value obtained in this way is called the correlation coefficient and is defined as follows:

$$r = \frac{\mathrm{cov}(x, y)}{\sqrt{\sigma_x^2 \sigma_y^2}} \tag{15}$$

The correlation coefficient r is used for measuring the degree of linear dependency between two variables. Its values range from -1 (complete negative or inverse correlation) to +1 (complete positive or direct correlation). The value r = 0 indicates the lack of linear dependency. Correlation is considered negligible if the absolute value of r is less than 0.3, medium strong when it is between 0.3 and 0.7, and strong when it is over 0.7. However, it is important to stress that a correlation coefficient of 0 does not necessarily mean the lack of any correlation between two variables, because the relationship between them may be non-linear. Unlike covariance, the correlation coefficient is not expressed in units of measurement [8].
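The linear fit and the correlation coefficient described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `linear_fit` and the sample readings are hypothetical, and the slope is computed as cov(x, y)/σx², which is algebraically equivalent to solving the two normal equations for a.

```python
import math


def linear_fit(x, y):
    """Return (a, b, r) for the least-squares line y = a*x + b.

    a and b follow from the normal equations; r is the covariance
    divided by the product of the standard deviations.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Population variances and covariance (divided by N, as in the text).
    var_x = sum((xi - mean_x) ** 2 for xi in x) / n
    var_y = sum((yi - mean_y) ** 2 for yi in y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n

    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined slope or correlation")

    a = cov_xy / var_x            # slope
    b = mean_y - a * mean_x       # intercept
    r = cov_xy / math.sqrt(var_x * var_y)
    return a, b, r


# Hypothetical readings, for illustration only (not measured data).
no2 = [0.10, 0.20, 0.30, 0.40, 0.50]
co = [7.0, 12.5, 18.5, 23.0, 30.0]
a, b, r = linear_fit(no2, co)
print(f"y = {a:.2f} * x + {b:.2f}, r = {r:.3f}")
```

Dividing by N rather than N - 1 does not affect a or r, because the factor cancels in both ratios.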
The significance of a correlation coefficient r depends on the sample size and on the value of r. To test the significance level of the correlation coefficient r, the corresponding t value for the t distribution is calculated as

$$t = r \sqrt{\frac{N - 2}{1 - r^2}} \tag{16}$$

where N is the number of samples. Based on the calculated t value, the significance P of the correlation coefficient is obtained from the Student distribution. The correlation coefficient is significant if the obtained value is P < 0.05 [8].
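As a rough illustration of this significance test, the following Python sketch computes the t value and a two-sided P value using only the standard library. The function names are hypothetical, the P value is obtained by numerically integrating the Student density (a statistics package would normally be used instead), and N = 24 is an assumed sample count chosen to match 24 hourly samples.

```python
import math


def t_statistic(r, n):
    """t value for a correlation coefficient r computed from n samples."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def t_pdf(x, dof):
    """Density of the Student t distribution with dof degrees of freedom."""
    log_c = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
             - 0.5 * math.log(dof * math.pi))
    return math.exp(log_c - (dof + 1) / 2 * math.log1p(x * x / dof))


def two_sided_p(t, dof, steps=4000):
    """P(|T| > |t|), via Simpson integration of the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = upper / steps
    total = t_pdf(0.0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    central = total * h / 3          # probability mass in [0, |t|]
    return max(0.0, 1.0 - 2.0 * central)


# Assumed example: r = 0.85 from N = 24 samples.
t_value = t_statistic(0.85, 24)
p_value = two_sided_p(t_value, dof=24 - 2)
print(f"t = {t_value:.3f}, significant: {p_value < 0.05}")
```

Under these assumptions the t value is about 7.57, well above the two-sided 5 % critical value of roughly 2.07 for 22 degrees of freedom.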

III. IMPLEMENTATION OF THE MEASUREMENT DEVICE
In order to apply the statistical method described above to environmental parameters, it is first necessary to identify the most significant polluters in the area being observed, and then to identify the pollution types [10]. For this purpose it is possible to use the existing stationary measurement stations that have already been deployed in urban areas. If that is not possible, mobile measurement stations are required to capture a snapshot of the situation in the field. Because we focused on urban areas, where the primary pollution source is traffic, i.e. internal combustion engines, we decided to build a mobile station for measuring environmental parameters that can be installed on public transport vehicles. With a modified power supply, this device could also be used as a stationary measurement station. Figure 1 presents the overall system architecture of our environment monitoring system. Our device is able to measure the concentrations of the following gases in the air: H2S, NO O2, SO2, CO2, as well as atmospheric pressure, ambient temperature and air humidity. A detailed explanation of the selected electrochemical sensors, including their usage, advantages and disadvantages, is given in [11]-[17]. The sensors used for measuring the CO and the NO2 concentrations are the TGS2442 [18] and the MiCS-2710 [19], respectively. These sensors were chosen in order to avoid cross-sensitivity, which can have a detrimental impact on the measurements. Both belong to the group of gas sensors whose output electrical resistance changes with the gas concentration, and both have a fast response and a high sensitivity. The overall sensor control logic, including the collection and transfer of measurement data, is implemented on a dsPIC30F4013 microcontroller connected to a Telit GM862 GPRS/GPS modem. A block scheme of the implemented system is presented in Fig. 1, while the actual device is presented in Fig. 2.
Sensor measurements are sent to a supervision centre, i.e. a server implementing the processing logic together with an underlying database. The server processes the data and stores them permanently; the results are visualized according to the specific requirements of the end users. Systems with a similar hardware architecture are described in [20], [21]. In this paper, the added feature is the ability to control and monitor the system via a mobile phone, as in [22], by sending appropriately formatted SMS messages. The following parameters of the device can be configured via SMS: start operation, stop operation and change of the IP address of the server. There are systems based on one or more stationary stations, as described in [23]-[26]. Our system is based on one mobile measurement device, which further improves the performance of the system.

IV. AIR POLLUTION MEASUREMENT RESULTS
The measurements were conducted over a period of one month at different locations. We decided to use data from one stationary device and three mobile devices. In cooperation with the public transport company of the city of Pancevo, the mobile devices were installed on city buses [27]. The results from all devices are stored in a central database. Since the three mobile stations installed on buses send their current GPS location together with the gas concentrations, we had the opportunity to observe the environmental conditions in different parts of the city. Three regions of the city were selected for monitoring: an area around the city centre, an area around the central bus station and an area near one of the main traffic crossroads. The results of the measurements are presented in the following text.
Analysing the parameters of the environment, it was evident that there is a correlation between the CO and the NO2 gases (Fig. 1). Since both gases originate predominantly from the combustion of fossil fuels (oil), we wanted to investigate whether there is a mutual correlation between their concentrations. If this assumption proves to be correct, it allows the values of one gas to be approximated from the measured values of the other. A similar approach to data processing can be seen in [7]. To establish whether there is a significant overlap between the CO and the NO2 concentration measurements, the measurement results were processed using the statistical methods described in Section II.
In the first case, data from the stationary device were analysed. The graphs in Fig. 3 show the measurements gathered over a 24-hour interval with a 1-hour sampling interval. The regression line for the data from the static measurement device is shown in Fig. 4. The regression line has the form y = 55.86 × x + 1.46, while the correlation coefficient is r = 0.85. The large value of this coefficient indicates a strong interdependence of these gases. The CO-NO2 correlation graph shows that the regression is r = 0.85, i.e. the efficiency of CO production per mole of NO2 was 85 % in this area of the city.
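Once such a regression line is available, estimating one gas from the other reduces to evaluating the line. The sketch below uses the coefficients reported for the stationary device; the assumption that x is the NO2 reading and y the estimated CO value is ours (the text does not state the axis assignment or the units), and the function name is hypothetical.

```python
def estimate_from_regression(x, slope=55.86, intercept=1.46):
    """Evaluate the regression line y = slope * x + intercept.

    Defaults are the coefficients reported for the stationary device.
    Assumption: x is the NO2 reading and y the estimated CO value, in
    the units of the original measurements (not stated in the text).
    """
    return slope * x + intercept


# Hypothetical NO2 reading of 0.5 (units as in the original data).
co_estimate = estimate_from_regression(0.5)
print(f"estimated CO: {co_estimate:.2f}")
```

Such an estimate is only meaningful within the range of values covered by the original measurements and under comparable conditions.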
In all three graphs presented in Fig. 5, the measurement results were gathered at more than 650 points. The results obtained from the mobile devices through the statistical methods described in Section II were compared with the results obtained directly from the stationary measurement device: in every case the regression is linear and the correlation coefficient is 0.7 or higher. This analysis verifies the correlation between the concentrations of the CO and the NO2 gases.
Figure 6 gives a layout of the measurements obtained from the devices, which were mounted on top of public buses in order to simplify deployment and cover a large area. The use of mobile environment monitoring devices significantly reduces the price of the complete system. The drawback is that the data are collected during different time intervals at different locations, so real-time results are not available at every location and sudden changes of some parameters may go undetected. The approach is therefore suited to monitoring areas with slowly changing weather conditions, i.e. when there are no abrupt changes in weather.
Using the data generated by the mobile measurement units and the regression analysis, it is possible to estimate the concentration of the monitored gases across the region. Based on the measurements obtained from the monitoring devices, we formed a data matrix for the individual parts of the city, i.e. for the areas through which the buses passed during the measurement campaign (Fig. 6). Based on this data matrix, values for the neighbouring areas were extrapolated using Matlab. In this way, values are estimated for locations where no exact measurements exist (areas surrounded by areas where exact measurements were performed). The results are presented in Fig. 7. Comparing Fig. 7(a) and Fig. 7(b), similar colours appear at the same locations. This further supports the conclusion that, under certain conditions (no abrupt changes in weather), one of the sensors can be left out when measuring the concentrations of CO and NO2.
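The text does not state which Matlab routine was used to fill in the unmeasured areas. As a stand-in, the following Python sketch uses inverse distance weighting (IDW), a simple and common spatial interpolation technique; it is not claimed to be the method behind Fig. 7. The function names, the coordinates and the sample values are all hypothetical.

```python
import math


def idw_estimate(samples, query, power=2.0):
    """Estimate a value at `query` from scattered samples.

    samples: list of (x, y, value) tuples, e.g. projected GPS positions
             with a gas concentration; query: (x, y) tuple.
    Each sample is weighted by 1 / distance**power.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(query[0] - sx, query[1] - sy)
        if distance == 0.0:
            return value  # the query coincides with a measured point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total


def idw_grid(samples, xs, ys, power=2.0):
    """Evaluate the IDW estimate on a regular grid (rows follow ys)."""
    return [[idw_estimate(samples, (x, y), power) for x in xs] for y in ys]


# Hypothetical measurements: (x, y, concentration).
measurements = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (1.0, 2.0, 30.0)]
grid = idw_grid(measurements, xs=[0.0, 1.0, 2.0], ys=[0.0, 1.0, 2.0])
for row in grid:
    print(" ".join(f"{v:6.2f}" for v in row))
```

The resulting grid could then be passed to a contour plotting routine to obtain a map of the kind described in the text. IDW always yields values between the minimum and maximum of the measurements, so it cannot reproduce peaks located between measurement points.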
In our measuring device, the signal acquisition functions and the mathematical analysis of the obtained values are integrated into a single system. This also makes it possible to produce high-quality diagrams of the analysis results.

V. CONCLUSIONS
The implemented measurement system enables multiple savings without degradation of the measured values. Locating the most polluted places and the places with the basic pollution types brings important advantages. Firstly, the number of required sensors can be decreased by calculating the correlation coefficient between certain types of gases, since by following one pollution type it is possible to reproduce the values of another gas statistically. This means that it is not necessary to buy every kind of sensor for every measuring station. Secondly, by using the regression equations obtained from the numerous measurements, gas values can be approximated mathematically even at locations where there are no measuring stations. To avoid setting up a large number of measuring stations all over the area, the data can be collected using one mobile system that has all the required types of sensors. In this way, i.e. by combining real measurements with statistically derived results, it is feasible to obtain satisfactory results without compromising the quality of the measurements. This combination provides savings through both the reduced number of sensors and the decreased number of measuring stations. By calculating contour lines, the degree of air pollution over the whole city territory can be presented graphically in a clear way.
The system enables observation of the situation under stationary conditions, whereas further investigations will be directed towards obtaining accurate values during abrupt atmospheric changes and substantial concentration changes.

Fig. 1. Block scheme of the implemented environment monitoring system.

Fig. 2. Layout of the implemented measurement PCB with a microprocessor, sensors, and a modem.

Fig. 3. Measurement results gathered over a 24-hour interval, with a 1-hour sampling interval, at one location.

Fig. 4. Regression line for the data from the static measurement station.
In the first graph, presented in Fig. 5(a), the CO-NO2 correlation graph is linear, y = 26.434 × x + 16.391, and the resulting regression is r = 0.88 (r² = 0.7902), with the significance of the correlation coefficient P < 0.001, i.e. the efficiency of CO production per mole of NO2 was 88 %. This result also shows a strong correlation between the concentrations of the CO and NO2 gases. In the second graph, presented in Fig. 5(b), the CO-NO2 correlation graph is linear, y = 8.7714 × x + 8.7971, and the resulting regression is r = 0.7 (r² = 0.489), with the significance of the correlation coefficient P < 0.001, i.e. the efficiency of CO production per mole of NO2 was 70 %. This result also shows a strong correlation between the concentrations of the CO and NO2 gases. In the last graph, presented in Fig. 5(c), the CO-NO2 correlation graph is linear, y = 33.169 × x + 31.56, and the resulting regression is r = 0.8 (r² = 0.6352), with the significance of the correlation coefficient P < 0.001, i.e. the efficiency of CO production per mole of NO2 was 80 %. This result also shows a strong correlation between the concentrations of the CO and NO2 gases.

Fig. 5. Regression lines for the results collected from the mobile stations; the regression is linear and the correlation coefficient is 0.7 or higher.

Fig. 6. Layout of the measured gas concentration values, with the precise GPS measuring locations: (a) NO2, (b) CO.

Figure 6(a) and Fig. 6(b) show the concentrations of NO2 and CO, respectively. Minimum concentration values are indicated by green dots, medium concentrations by yellow dots and high concentrations by red dots.

Figure 7(a) and Fig. 7(b) represent the concentrations of the gases in one part of the city in the form of a contour plot obtained using Matlab. The areas shown in dark blue are the areas where the gas was not measured or where its concentration is rather low. The areas shown in red are the areas with the highest gas concentrations.