Round-Trip Delay Estimation in OPC UA Server-Client Communication Channel

In this paper, the round-trip delay (RTD) in an OPC UA server-client channel was estimated in various data communication networks, including Ethernet, WiFi, and 3G. Testing was carried out using a purpose-built IoT gateway device running an OPC UA server and a remote computer running an OPC UA client. The server and client machines were configured to operate in a Virtual Private Network powered by OpenVPN. Experimental analysis revealed that RTD values are distributed over a wide range and exhibit difficult-to-explain outliers that significantly exceed the average RTD value. A preliminary exploration of the correlation between the instantaneous load of the communication gateway processor and RTD peaks was carried out on Texas Instruments ARM Cortex-A8 processors running at 600 MHz and 800 MHz clock frequencies. DOI: http://dx.doi.org/10.5755/j01.eie.22.6.17229


Index Terms: Round-trip delay; server-client communication; processor load.

I. INTRODUCTION
Automation networks and data communication networks merge in Internet-of-Things (IoT) architectures. An important role in the IoT architecture is played by so-called IoT communication gateways (CGW), which ensure connectivity between heterogeneous field networks (industrial buses, wireless, machinery, building automation, sensor networks, etc.) and IP-based wide area data communication networks. Some of the described IoT gateway architectures are dedicated to conversion between two connectivity protocols, for example Zigbee to TCP/IP [1], CAN to GPRS [2], or CAN to TCP/IP (WiMAX) [3], while other IoT gateways are multiprotocol devices, especially on the field side [4], [5].
A CGW exposes access to field data sinks and sources to Cloud-hosted services. Two technologies are mostly discussed for web service provision: Simple Object Access Protocol (SOAP) and Representational State Transfer (REST) [5], [6]. As an application-level protocol, the OPC UA standard (Open Platform Communications Unified Architecture) (https://opcfoundation.org/) is expected to play a noticeable role in IoT [7]. The OPC UA protocol was designed for machine-to-machine industrial interoperability and is now well established in the field of industrial automation. OPC UA is also a cross-platform service-oriented architecture (SOA) for process control.
Compared to the REST approach, the SOAP-style OPC UA features a standardized security architecture, mature commercial software development kits (SDK), data modelling tools (address space), and wide acceptance in the field of industrial automation. The advantages of the REST approach are easier implementation and lower performance requirements on the (resource-constrained) embedded system. These advantages could be crucial for resource-limited IoT nodes such as remote sensors, but are less critical for infrastructure devices such as a CGW. Recently described CGW implementations are based on quite powerful processors such as ARM9 [1], [8], [9], ARM Cortex-M4 [10], Intel Atom/Quark SoC processors [11], etc.
The goal of our research is to investigate the achievable round-trip delay (RTD) between an OPC UA server and client in different data communication networks such as Ethernet, WiFi, and 3G. The range of the RTD is needed to identify possible application areas and the characteristics of processes that may be monitored or controlled using a CGW connected to Cloud services by means of the OPC UA standard at the application level.

II. RELATED WORKS
The performance of the OPC UA server-client communication channel was investigated by Salvatore Cavalieri et al. [12], [13]. The authors used round-trip delay and delay/sampling interval to quantify the performance of OPC UA. The factors influencing channel performance are: 1) the OPC UA protocol and the efficiency of its server and client software implementation; 2) OPC UA server and client settings; 3) the underlying networks and the protocols used; 4) the computing performance of the target hardware running the server and client applications. Cavalieri et al. [13] focused on OPC UA performance estimation using a simulation environment, thereby abstracting from the implementation of the server and client SDK, target hardware, underlying communication networks (transport layer), etc. The main finding of their research is that, for the OPC UA subscription profile, the delay/sampling interval and bandwidth utilization are highly dependent upon the settings of the subscription service (mainly the Publish Interval). In our experimental performance testing, we run the developed software on the CGW and communicate data over a real communication infrastructure. Specifying the test environment in this case is much more challenging because of the large number of factors influencing the load of the communication network (3G, WiFi, Ethernet). However, physical experiments enable us to produce real estimates of the channel throughput and to verify the influence of the CGW hardware and software implementations.

III. COMMUNICATION GATEWAY ARCHITECTURE
We have implemented the CGW on a Variscite System-on-Module (SoM) running the Debian Linux operating system. The Variscite SoM contains a Texas Instruments Cortex-A8 processor running at an 800 MHz clock frequency, 512 MB of Flash memory, 512 MB of DDR3 SDRAM, and an on-board WiFi communication module supporting access point (hot spot) and client modes.
At the software level, the CGW runs an OPC UA server, which was developed using the Softing GmbH OPC UA SDK for Linux. The OPC UA client was developed using the Softing GmbH OPC UA client SDK for Windows. OPC UA binary mode is used in all tests presented in Chapter V.
A virtual private network (VPN) was established in order to assign a non-global IPv4 address to the CGW. For this purpose, OpenVPN clients were installed on both the server (CGW) and the client workstation. The OpenVPN server was configured on a Linux machine connected to the Internet through Gigabit Ethernet. The VPN was also responsible for securing the channel. To avoid securing the channel data twice, OPC UA data encryption was disabled.
Figure 1 presents a diagram of the interconnection between the OPC UA server and the Linux processes, denoted as Protocol adapters, that are responsible for accessing data in field networks. The communication between them is implemented using Linux inter-process sockets. For testing purposes, an OPC UA server address space containing 30 nodes of double type was created. Each node value is updated with simulated data from a Protocol adapter every 10 ms in order to simulate data generated by a monitored process. It should also be noted that every node in the OPC UA address space is automatically supplemented by two time stamps (source and server), each 8 bytes in size.

IV. TEST METHODOLOGY AND ENVIRONMENTS
Round-trip delay t_R is defined as the time interval between the moment the client issues a data request and the moment the same client receives the response.
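As an illustration of this definition, the following minimal Python sketch times a blocking request from the client side. It is not the client software used in the tests: `measure_rtd` and the `request` callable are hypothetical names, and Python's `time.perf_counter_ns` is assumed here as the high-resolution clock.

```python
import time
from typing import Callable, List


def measure_rtd(request: Callable[[], object],
                n_requests: int = 1000,
                pause_s: float = 0.0) -> List[float]:
    """Collect round-trip delay samples in milliseconds.

    `request` is a hypothetical blocking call that sends one read
    request and returns only when the response has been received.
    `pause_s` is an optional waiting delay inserted between a
    response and the next request.
    """
    samples_ms: List[float] = []
    for _ in range(n_requests):
        start_ns = time.perf_counter_ns()   # moment of data request
        request()                           # returns on response reception
        elapsed_ns = time.perf_counter_ns() - start_ns
        samples_ms.append(elapsed_ns / 1e6)
        if pause_s > 0:
            time.sleep(pause_s)
    return samples_ms
```

In a real client, `request` would wrap the SDK call that reads one or more nodes; any callable that blocks until the reply arrives can be timed in the same way.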
Test environments comprising the CGW, OpenVPN server, IP gateway, and a remote workstation running the OPC UA client were set up as specified in Table I and shown in Fig. 2 to Fig. 4. The presented network configurations are very common for connecting IoT gateways (such as the CGW) to TCP/IP Wide Area Networks (WAN). Once connected to a WAN, the CGW can access Cloud services, for example, to store collected data in SQL or Big Data databases. The developed CGW prototype was initially intended to facilitate connectivity between precision agriculture sensors or machinery and Cloud services. However, an IoT gateway is a general-purpose infrastructure device that can be used in a variety of applications in the areas of smart cities, building automation, healthcare, smart grid, etc. The CGW maintains Ethernet, WiFi, and 3G communication channels for connectivity towards the WAN. The motivation for using a VPN was given in Chapter III. We chose these scenarios in order to investigate both the delivery of a single field-acquired quantity to a remote client and the delivery of a medium-sized set of field quantities, for example read from machinery buses.

To collect RTD samples for statistical estimation, 1000 requests were generated by the client. RTD was measured by the client software using the Windows API QueryPerformanceCounter function, which retrieves the current value of the target system's high-resolution (1 µs) time stamp. Histograms of the RTD samples are shown in Fig. 5 and Fig. 6. It is interesting to note that in the 3G-VPN environment (Fig. 5) RTD increases when 5 s delays are introduced between requests. This feature might be important to consider when remote field process events must be detected quickly. In contrast, when only data logging is required, RTD is of less importance because data may be buffered in the OPC UA server's address space together with a precise time stamp of the acquisition moment. Buffered arrays of data can then be delivered to the OPC UA client for non-real-time inspection.

From Fig. 6 it is seen that in the WiFi-VPN environment the probability distribution function (PDF) is close to exponential and contains a considerable number of very high RTD values (column "More"). This phenomenon was also observed in all the other test environments reported in Table II, except scenario D. Because these RTD outliers can heavily affect the worst-case data delivery delay from the observed process, it is of interest to identify the reason for their appearance. Firstly, we examined the test environments using the ping utility from the TCP/IP stack. The delays reported by ping did not indicate any considerable outliers in the RTD of ping packets. Secondly, we hypothesized that the random RTD increases are due to insufficient CGW CPU performance. In Chapter VI we give some preliminary results of CPU load observations. It should also be noted that the Linux scheduler allocates 10 ms time slots for the execution of each active software process (thread). Therefore, any RTD fluctuation in the range of 10 ms can be due to process switching in the CGW operating system.
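The statistical treatment described above (an average RTD, a histogram with an overflow "More" column, and outliers far above the average) can be sketched as follows. This is a minimal Python illustration, not the processing code used for the figures; the function name `summarize_rtd`, the bin layout, and the `outlier_factor` parameter are assumptions chosen for the example.

```python
from statistics import mean
from typing import Dict, List


def summarize_rtd(samples_ms: List[float],
                  bin_width_ms: float,
                  n_bins: int,
                  outlier_factor: float) -> Dict[str, object]:
    """Summarize RTD samples given in milliseconds.

    The histogram has `n_bins` regular bins of width `bin_width_ms`
    starting at 0, plus one final overflow bin that plays the role of
    the "More" column. Samples larger than `outlier_factor` times the
    average are reported as outliers (illustrative criterion).
    """
    average = mean(samples_ms)
    counts = [0] * (n_bins + 1)           # last entry is the "More" bin
    for sample in samples_ms:
        index = min(int(sample // bin_width_ms), n_bins)
        counts[index] += 1
    outliers = [s for s in samples_ms if s > outlier_factor * average]
    return {
        "average_ms": average,
        "min_ms": min(samples_ms),
        "max_ms": max(samples_ms),
        "histogram": counts,
        "outliers_ms": outliers,
    }
```

For example, `summarize_rtd(samples, bin_width_ms=1.0, n_bins=20, outlier_factor=5.0)` would place every sample above 20 ms in the overflow bin; these particular values are arbitrary.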

VI. ROUND-TRIP DELAY AND CPU LOAD RELATIONSHIP
To investigate the relationship between the server's CPU load and the OPC UA channel's RTD, the CPU load was recorded during the testing. The results presented in this chapter were acquired with the CGW processor running at a reduced clock frequency (600 MHz instead of 800 MHz) in order to investigate the suspected insufficiency of processing performance. Sample plots of the obtained results are presented in Fig. 7.
Though the recorded RTD and CPU load patterns intuitively seem to correlate, the calculated Pearson correlation indicated a rather weak relationship. CPU load fluctuates due to the various activities of internal Linux processes. We assume that up to a certain level the influence of CPU load on RTD is minor. Therefore, we formulated the following hypothesis: RTD peaks exceeding the least observed RTD value by N times are related to events of CPU load exceeding a threshold level K. In order to verify this hypothesis, the experimentally acquired RTD and CPU load patterns (Fig. 7(a) and Fig. 7(b)) were transformed to the corresponding binary sequences:

\[
b_{RTD}(i) =
\begin{cases}
1, & t_R(i) > N \cdot \min_j t_R(j), \\
0, & \text{otherwise},
\end{cases}
\qquad
b_{CPU}(i) =
\begin{cases}
1, & L_{CPU}(i) > K, \\
0, & \text{otherwise},
\end{cases}
\]

where t_R(i) is the i-th RTD sample and L_CPU(i) is the corresponding CPU load sample.
Then the Pearson correlation coefficient between these sequences was calculated; it is presented in Table III (PEA columns). In all cases (N and K value combinations) the significance level (p-value) of the hypothesis of no relationship between the sequences is less than 0.05, indicating that the corresponding correlation is significant. The similarity between binary patterns can alternatively be estimated using a variety of measures [14]. We chose to calculate the Kulczynski-II index (KUL columns). This index averages the conditional probability that a peak is present in the RTD pattern, given that a peak is present in the CPU load pattern, with the converse conditional probability. It can be seen from Table III that the Kulczynski-II index is quite close to the Pearson correlation coefficient. Therefore, we assume that the larger the peaks appearing in the RTD pattern, the higher the probability that they are due to increased CPU load.
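The thresholding and the two similarity measures discussed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the analysis code behind Table III: the function names are hypothetical, the RTD and CPU load samples are assumed to be already aligned index by index, and the Kulczynski-II index is taken in its usual form as the mean of the two conditional probabilities.

```python
from math import sqrt
from typing import List, Sequence


def binarize_rtd(rtd: Sequence[float], n: float) -> List[int]:
    """1 where an RTD sample exceeds N times the least observed RTD."""
    floor = min(rtd)
    return [1 if t > n * floor else 0 for t in rtd]


def binarize_cpu(load: Sequence[float], k: float) -> List[int]:
    """1 where the CPU load sample exceeds the threshold level K."""
    return [1 if c > k else 0 for c in load]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def kulczynski2(x: Sequence[int], y: Sequence[int]) -> float:
    """Kulczynski-II similarity of two binary sequences."""
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)  # peak in both
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)  # peak only in x
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)  # peak only in y
    if a + b == 0 or a + c == 0:
        raise ValueError("each sequence must contain at least one peak")
    return 0.5 * (a / (a + b) + a / (a + c))
```

With `b_rtd = binarize_rtd(rtd, n=3)` and `b_cpu = binarize_cpu(load, k=90)`, the pair `pearson(b_rtd, b_cpu)` and `kulczynski2(b_rtd, b_cpu)` corresponds to one (N, K) combination; the values 3 and 90 are arbitrary here. `pearson` is undefined (division by zero) when either sequence is constant.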

VII. PROFILING OF CPU LOAD
The approximate profiling of the CGW CPU load due to separate software processes was carried out by reading the Linux kernel system statistics files proc/stat (http://man7.org/linux/man-pages/man5/proc.5.html), which provide an ongoing look at processor activity in the operating system. The CGW software then added the corresponding readings to its OPC UA server's address space. In particular, the CPU load due to the Linux processes executing the OPC UA server, Protocol adapter, VPN client, and WiFi driver, as well as the total CPU load, were recorded during the testing. From typical profiling charts (Fig. 8 and Fig. 9) it is evident that the OPC UA server process contributes most significantly to the overall CPU load. Other processes such as the Protocol adapter, VPN client, and WiFi driver are less demanding of processor performance. From Fig. 9 we see that the processing performed by the OPC UA server constitutes nearly the whole CPU load in the Ethernet-VPN environment. In the WiFi-VPN setup, some processor resources are occupied by the WiFi driver. Therefore, during the WiFi-VPN testing RTD was larger (see Table II), resulting in fewer requests arriving from the client per unit of time. That means the OPC UA server handled fewer requests in the WiFi-VPN case than in the Ethernet-VPN case. The overall CPU load was more sporadic in the WiFi-VPN case (Fig. 8), perhaps because the WiFi driver is executed by the CGW main processor. The CGW CPU load decreases significantly (see Fig. 10) when a 150 ms delay is introduced between client requests. Demand for such an update rate (compared to full-speed interrogation) is probable, for example, in applications that retrieve machinery-generated data from a CAN bus and deliver it to an OPC UA client.
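A CPU load reading of this kind can be derived from two snapshots of the kernel counters. The Python sketch below is an illustration only, assuming the system-wide `/proc/stat` file and its aggregate `cpu` line as documented in proc(5); it is not the CGW software, the function names are hypothetical, and per-process figures (which would come from the per-process stat files) are not shown.

```python
import time
from typing import Tuple


def parse_cpu_line(stat_text: str) -> Tuple[int, int]:
    """Return (total_ticks, idle_ticks) from the aggregate 'cpu' line.

    Only the first eight counters (user, nice, system, idle, iowait,
    irq, softirq, steal) are summed; guest time is already included
    in user time. Idle time is taken as idle + iowait.
    """
    for line in stat_text.splitlines():
        if line.startswith("cpu "):
            fields = [int(v) for v in line.split()[1:9]]
            idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
            return sum(fields), idle
    raise ValueError("no aggregate 'cpu' line found")


def cpu_load_percent(before: str, after: str) -> float:
    """Total CPU load in percent between two /proc/stat snapshots."""
    total_b, idle_b = parse_cpu_line(before)
    total_a, idle_a = parse_cpu_line(after)
    d_total = total_a - total_b
    if d_total <= 0:
        return 0.0
    return 100.0 * (1.0 - (idle_a - idle_b) / d_total)


def sample_cpu_load(interval_s: float = 1.0,
                    path: str = "/proc/stat") -> float:
    """Read the counters twice, `interval_s` apart (Linux only)."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval_s)
    with open(path) as f:
        after = f.read()
    return cpu_load_percent(before, after)
```

Keeping the parsing separate from the file access lets the calculation be checked on recorded snapshots; the sampling interval is an arbitrary choice in this sketch.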
A more thorough investigation and profiling of Linux processes is required to identify the reasons for the appearance of RTD outliers in the OPC UA data delivery channel.

VIII. CONCLUSIONS
In WiFi and Ethernet based networks, the average round-trip delay of the OPC UA channel achievable using an 800 MHz ARM Cortex-A8 processor running the Linux operating system is in the range of several milliseconds per OPC UA node. Grouping OPC UA nodes is beneficial, because RTD grows more slowly than the number of nodes (Table II). In the 3G network, the average RTD of the OPC UA channel was around 250 milliseconds and depended little on the number of nodes delivered to the client (from 1 to 30 nodes).
In all networks, RTD exhibited sporadic outliers exceeding the average value by up to more than 20 times. Except for the 3G-VPN environments, we suspect that these outliers were caused by temporary overload of the CPU (100 % load). The number of such outliers was significantly higher when the main processor was running at 600 MHz than at 800 MHz.
The OPC UA server is a computationally demanding application. When full-speed interrogation by the OPC UA client is executed, the 800 MHz ARM Cortex-A8 processor was only just sufficient to handle OPC UA server execution without affecting the round-trip delay in the data transfer channel.

Fig. 8. CPU load profile of the CGW software processes during the WiFi-VPN test, scenario C.

Fig. 9. CPU load profile of the CGW software processes during the Ethernet-VPN test, scenario C.

Fig. 10. CPU load profile of the CGW software processes during the WiFi-VPN test, scenario D, except that the delays between requests were 150 ms.

TABLE I. TEST ENVIRONMENTS SPECIFICATION.
A. One double-type node requested at maximum speed.
B. 10 double-type nodes requested at maximum speed.
C. 30 double-type nodes requested at maximum speed.
D. Same as C, except that 5 s waiting delays between adjacent response and request were introduced by the client.

TABLE II. RTD ESTIMATION RESULTS.

TABLE III. RTD PEAKS AND CPU LOAD CORRELATION.