Dynamic Repartitioning of Large Data Model in Distribution Management Systems

Modern distribution management systems that use multiprocessor systems to efficiently process a large data model are examined. The aim of this research is to obtain an optimal processor load with respect to memory usage and computation time. Dynamic repartitioning is performed at run time when an imbalance is detected. The diffusion repartitioning and Cut-Paste algorithms are discussed. In addition, these algorithms were modified in order to improve dynamic repartitioning performed on multiprocessor systems. The proposed algorithms were applied to the data of a large power network model. Experiments showed a reduction of the processor load imbalance and an increase in performance. Ref. 12, tables 2 (in English; abstracts in English and Lithuanian). DOI: http://dx.doi.org/10.5755/j01.eee.120.4.1461


Introduction
Modern large-scale critical infrastructure systems (such as electric power, water and gas management systems) are faced with a continuously increasing amount of data that should be processed and calculated. One way to optimize the functionality of these systems is parallelization of their calculations [1]. In this way, the data model is shared among the processors so that the parts are independent in terms of the calculations. Bearing in mind that the divided data can change dynamically, causing a change of the independent groups of data, it is necessary to reallocate the already partitioned data. In this paper, algorithms for dynamic data repartitioning that are adjusted for the NUMA multiprocessor architecture are discussed.
A NUMA multiprocessor computer is organized in nodes, in which each node has a set of processors and a part of the main memory. The processor topology determines the memory access time to data, and therefore different levels of access to data are distinguished.
This paper analyzes data models in Distribution Management Systems (DMS). These systems contain and process data on the power distribution network in order to perform the supervision, management and planning of the electric power network. DMS systems are composed of the following components: 1) a SCADA system that takes the values from remote terminal units (RTUs) and sets dynamic values; 2) DMS analytical functions that are performed in order to optimize the operation, planning and operation of networks in alarm situations; 3) a database that stores data and the history of data changes in the system. Following the dynamics of change of certain parameters in the system (such as the change in status of switches), a DMS system needs to promptly determine the state of the system and possibly initiate some control actions.
The data model of such systems represents radial, weakly meshed networks that are suitable to be divided into smaller sub-models, which will be used by parallel tasks as independent data model partitions. The aim of partitioning is to optimally divide the data in order to minimize data relations across these subsets. Also, by making partitions of similar size, the computation load of every processor in the multiprocessor system might be balanced. In order to achieve this balance, graph-based partitioning methods are used [2]. Therefore, a graph is created out of the data model and the optimization problem is set as a problem of graph partitioning.
The optimal partitioning can be done in two complementary ways: 1) initially, before starting the system, and 2) dynamically, while the system works (on-line) [3]. The dynamic repartitioning is applied while the system is working, triggered by changes in inputs (that affect the calculations) or periodically if the processor load is imbalanced [6]. When such an imbalance is detected, the dynamic repartitioning is started and data are migrated among the processors' local memories.
Cybenko [4] used the diffusion method for dynamic repartitioning. Following his research, other diffusion algorithms for dynamic load balancing were developed. In [5] a diffusion algorithm for dynamic load balancing is described. In that algorithm repartitions are defined by calculating the Lagrange multiplier and using the Laplacian matrix. The most studied techniques of dynamic load balancing are scratch-remap and diffusion algorithms [6,7], which minimize data migration by extending load balancing and the number of connections between partitions (external edges) with additional optimization criteria. Diffusion algorithms show better results than scratch-remap [6,7] because they reduce data migration and the external edges. More recent diffusion algorithms described in [8] make further reductions in data migration and the external edges. However, they are slow and not applicable to on-line systems. Finally, the CP algorithm was studied because it requires the least data migration.
In the case of NUMA multiprocessor systems, different levels of memory and speed of data access are discussed depending on the system architecture. In [9] multilevel algorithms for load balancing of processors are studied. Different memory access levels are considered, and balancing is carried out first among processors with the same level of memory access and then among processors on the following memory access levels. According to this, specialized multilevel algorithms for dynamic graph partitioning in NUMA multiprocessor systems are developed.
In this paper, simple Diffusion Repartitioning (DR) and Cut-Paste (CP) algorithms for dynamic partitioning are applied. Additionally, modified versions of these algorithms (MDR and MCP) are developed to support dynamic repartitioning running in NUMA multiprocessor systems. The MDR and MCP algorithms give better experimental results than the DR and CP algorithms.
The paper is organized as follows. Section 2 describes the problem and the terminology used, and presents the details of the discussed data model and the definition of the optimization problem. Section 3 describes the proposed algorithms for dynamic repartitioning. Section 4 describes the experimental setup and presents and discusses the results. Section 5 is the conclusion.

Problem definition
It is expected that control and supervision of the DMS system result in a quick response to changes of certain parameters and in calculations of the necessary analytical DMS functions. The most significant DMS functions are Topology Analysis, Load Flow, State Estimation, Volt/Var Control, etc. This paper focuses on the optimization of functions whose calculations are carried out on parts of the network (Load Flow and State Estimation are the most often used functions). For simplicity, it is assumed that only one function ξ is running in the system and that the data model needs to be optimally partitioned in order to finish data processing as quickly as possible.
In order to analyze the problem more comprehensively, the data model and optimization criterion for the data partitioning will be described.
Data Model. A connectivity model that contains the basic data types important for the calculations is shown in Figure 1. The model is based on the Common Information Model (CIM) [10,11] for the purpose of efficient calculation (and reduced memory usage). CIM is the most important standard for electric power systems and it is published by the IEC (International Electrotechnical Commission) as a part of their international standard IEC 61970-301 [10]. The CIM data model is an abstract object model that represents all major entities (and the relations among them) in an electric utility enterprise. The data model is based on related objects with preserved relationships between objects. Each object models one type of electric element, and it contains all instances of the object type together with their attributes.
The connectivity model is composed of transformer substations (represented by PowerSystemResource) connected with power lines (ACLineSegment), and they are used to supply groups of consumers (EnergyConsumer). Substations contain various equipment (ConductionEquipment and PowerTransformer) and nodes (ConnectivityNode) that connect this equipment. ConductionEquipment objects are modelled as single- or double-ended conductors, and these ends are always connected with ConnectivityNode(s). For example, typical types of single-ended conductors are EnergyConsumer, EnergySource, and BusbarSection, while types of Switch (Breaker, Fuse), ACLineSegment, and others are double-ended conductors. In essence, the connectivity model is a branch-node model suitable for a graph presentation, where edges and vertices are instances of ConductionEquipment and ConnectivityNode, respectively. Elements that are connected in this way are called neighbours.
Relationships between the elements can be temporary and specific to their state (which can be changed externally). The relation between neighbours that depends on an input value $u_{ij}$ (for example, a switch state) could be temporarily inactive, and it is called a potential connection $Pot(\alpha_i, \alpha_j, u_{ij})$. In other words, relations between elements depend on input values, i.e. when a potential connection is activated the two elements become neighbours [3]:

$$Pot(\alpha_i, \alpha_j, u_{ij}) \text{ is active} \;\Rightarrow\; N(\alpha_i, \alpha_j) \quad (1)$$

Therefore, the potential connection is a relation that is characterized by its state (active or inactive). If the state is active, the elements are connected and they must be used for calculation together. Conversely, if the state is inactive, the elements are not connected.
The set of mutually connected elements is called a region $R$, which is the smallest data unit that can be processed by ξ.
An overall connectivity data model of a radial or weakly meshed power distribution network presented as a graph is suitable for parallel calculations. The most commonly used power function is the Load Flow (LF) [12], which uses all galvanically connected electrical elements that are powered from the same source (EnergySource). Therefore, LF is the discussed calculation function, and the elements that are used in the calculations are the electric elements. A set of all galvanically connected elements, which are simultaneously used in the LF calculation, is called a root, and it builds a region as a vertex of the coarse graph. The region weight is equal to the number of elements in the given root. Open switches, which make these edges potential connections in the inactive state, determine the boundary between regions (graph vertices). Two regions are united by closing the boundary switch (active state), and the calculation region is expanded to contain these two regions. The number of open switches between two regions determines the weight of the edge between the graph vertices, i.e. regions.
Data Model Partitioning. In a multiprocessor environment, function ξ can be applied to individual regions in parallel. If the number of regions is bigger than the number of processors, regions are grouped into p partitions (the result of partitioning is the set of partitions $\Pi = \{\pi_1, \ldots, \pi_p\}$) and distributed to p processors. This implies that a single processor sequentially executes the function ξ against the whole partition.
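The coarsening described above can be illustrated with a minimal Python sketch. It assumes that the model is given as a list of element names and a list of connections flagged as active or inactive; the function name `coarsen`, the tuple layout and the sample element names are hypothetical and are not part of the described system. Regions are found as connected components over the active connections (a union-find structure is used here as one possible technique), and each inactive connection between two different regions adds one unit to the weight of the edge between them.

```python
from collections import defaultdict


def coarsen(elements, connections):
    """Group elements into regions and build the coarse graph.

    connections: iterable of (a, b, active) tuples (hypothetical layout).
    Returns (region_weights, region_edges).
    """
    parent = {e: e for e in elements}

    def find(x):
        # Find the representative of x with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Active connections join elements into the same region.
    for a, b, active in connections:
        if active:
            parent[find(a)] = find(b)

    # Region weight = number of elements in the region.
    weights = defaultdict(int)
    for e in elements:
        weights[find(e)] += 1

    # Edge weight = number of inactive (potential) connections
    # between two different regions.
    edges = defaultdict(int)
    for a, b, active in connections:
        ra, rb = find(a), find(b)
        if not active and ra != rb:
            edges[frozenset((ra, rb))] += 1
    return dict(weights), dict(edges)


elements = ["src1", "line1", "load1", "src2", "line2", "load2"]
connections = [
    ("src1", "line1", True),
    ("line1", "load1", True),
    ("src2", "line2", True),
    ("line2", "load2", True),
    ("load1", "load2", False),  # open switch between the two feeders
]
region_weights, region_edges = coarsen(elements, connections)
```

In this example the two feeders form two regions of weight 3, joined by a single edge of weight 1 that corresponds to the open switch.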
Regions could change over time because connections among elements depend on input data (1). When a potential connection between two elements from different regions is activated, these two regions have to be merged. It is also expected that a deactivated potential connection could cause splitting of the region. Dynamic changes of inputs (1) cause activation and deactivation of potential connections, which makes the number of potential connections between regions variable. The number of potential connections between the elements of two regions can serve as a quantitative indicator of the probability that the two regions will be connected, and it is used as an optimization criterion for grouping regions into partitions. Therefore, the potential connections can be treated as a possibility for energizing two elements from the same source.
Data about the connectivity model are used to make the initial graph. As mentioned before, the structure of this data model is organized as a set of radial networks that can be weakly meshed [1,12], and the coarsening of the graph is completely based on the physical structure of the system. Consequently, the size of the coarsened graph is much smaller than the size of the initial graph, and the number of potential connections is close to the number of regions, which is extremely important for efficient graph partitioning.
In this research, the coarsening phase is applied first in order to group mutually connected elements into regions (2). The resulting domain D is described as a weighted undirected graph $G = (V, E)$ made of vertices ($V = \{v_1, v_2, \ldots, v_n\}$) and edges ($E = \{e_1, e_2, \ldots, e_m\}$). The weight of the edge $e_i$ is marked as $w(e_i)$ and the weight of the vertex $v_j$ is marked as $w(v_j)$. A vertex $v_i$ represents region $R_i$, with weight equal to the number of all its elements:

$$w(v_i) = |R_i|$$

It is assumed that the vertex weight is directly related to the region's calculation complexity. Graph edges represent potential connections between elements from different regions. Two regions, $R_p$ and $R_q$, could have many potential connections, and they are presented by using one edge $e(R_p, R_q)$, the weight of which is equal to the total number of such potential connections. This edge $e(R_p, R_q)$ represents a potential connection between regions, $Pot_{R_p,R_q}$. If two regions have a potential connection they are called neighbours.
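In order to define the optimization criterion for a partition $\pi_k$, we need to define the partition weight $W_{\pi_k}$ as the sum of the weights of all regions it contains. We also need to define a function $\varphi_k$ as

$$\varphi_k = \sum_{R_p, R_q \in \pi_k} w(e(R_p, R_q)) \quad (4)$$

where regions $R_p$ and $R_q$ are in $\pi_k$. This function is an indicator of "good connectivity" between the regions in the partition.
First it is necessary to group regions into a defined number of partitions (p), so that the weights of these partitions are approximately the same. They can never be greater than the maximal partition weight M defined as:

$$M = (1 + \varepsilon)\,\frac{1}{p}\sum_{k=1}^{p} W_{\pi_k}$$

where $\varepsilon$ is the allowed tolerance. The optimization criterion is

$$F = \max \sum_{k=1}^{p} \varphi_k$$

where function $\varphi_k$ is given in (4), and all partition weights are constrained by:

$$W_{\pi_k} \le M, \quad k = 1, \ldots, p$$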
It should be noted that the maximal number of connections inside each of the partitions means a minimal number of connections among partitions.
Two partitions $\pi_x$ and $\pi_y$ are neighbours if their regions are neighbours. The set of partitions neighbouring the partition $\pi_x$ is marked as $N(\pi_x)$.

Dynamic (on-line) Optimization.
If some partitions have weights greater than the maximal partition weight ($W_{\pi_k} > M$), the graph is imbalanced. In an imbalanced graph, a partition is overbalanced if its weight is greater than the maximal partition weight (M). The graph is balanced when no partition is overbalanced [6,7].
Dynamic repartitioning is needed when an imbalance between the processors' loads is detected, which is caused by changes of region weights. This could happen when: 1) changes of inputs activate potential connections between regions from different partitions, or 2) changes of some other inputs make processing in certain regions more frequent (this case is not considered in the paper). In both cases, the calculation of region weights is required and the dynamic algorithm for region repartitioning is started. The on-line repartitioning task attempts to get balanced and optimally connected partitions by applying minimal migrations of the regions between the partitions.
Repartitioning of Regions Caused by Outer Influences. This is done when an input signal activates the potential connection between the regions. These regions are joined into a new region whose weight equals the total weight of the two related regions, and all their edges are preserved. If the regions originate from different partitions, and if joining them makes the graph imbalanced, the on-line repartitioning is initiated.
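The balance test can be sketched in a few lines of Python. The formula used for M in this sketch is an assumption: the maximal partition weight is taken as the average partition weight increased by a tolerance `eps`, which is a common choice in graph partitioning. The function names are hypothetical.

```python
def max_partition_weight(partition_weights, eps=0.1):
    """Assumed form of M: average partition weight plus a tolerance."""
    p = len(partition_weights)
    return (1 + eps) * sum(partition_weights) / p


def overbalanced(partition_weights, eps=0.1):
    """Return the indices of partitions whose weight exceeds M."""
    m = max_partition_weight(partition_weights, eps)
    return [k for k, w in enumerate(partition_weights) if w > m]


weights = [120, 95, 100, 85]          # total 400, average 100
M = max_partition_weight(weights)     # about 110 with eps = 0.1
imbalanced_partitions = overbalanced(weights)
```

Here only the first partition (weight 120) exceeds M, so the graph is imbalanced and repartitioning would be started.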
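A minimal Python sketch of joining two regions follows. It assumes that the coarse graph is held in plain dictionaries (region weights, edge weights keyed by a pair of regions, and the partition of each region); this data layout and the function names are hypothetical. The merged region keeps the edges of both regions, and the edge between the two joined regions disappears because it becomes internal to the new region.

```python
def merge_regions(weights, edges, partition_of, a, b):
    """Merge region b into region a; returns the updated edge dictionary."""
    weights[a] += weights.pop(b)      # new weight = sum of both weights
    partition_of.pop(b)               # merged region stays in a's partition
    merged = {}
    for pair, w in edges.items():
        # Redirect every edge of b to a.
        pair = frozenset(a if r == b else r for r in pair)
        if len(pair) == 2:            # the former a-b edge is now internal
            merged[pair] = merged.get(pair, 0) + w
    return merged


def partition_weights(weights, partition_of):
    """Sum region weights per partition."""
    totals = {}
    for region, w in weights.items():
        k = partition_of[region]
        totals[k] = totals.get(k, 0) + w
    return totals


weights = {"R1": 40, "R2": 30, "R3": 30}
edges = {
    frozenset(("R1", "R2")): 2,
    frozenset(("R2", "R3")): 1,
    frozenset(("R1", "R3")): 1,
}
partition_of = {"R1": 0, "R2": 1, "R3": 1}

# A switch between R1 and R2 is closed: the regions are joined.
edges = merge_regions(weights, edges, partition_of, "R1", "R2")
totals = partition_weights(weights, partition_of)
```

After the merge, partition 0 carries weight 70 and partition 1 carries weight 30, so with a 10% tolerance around the average of 50 the graph has become imbalanced.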
Repartitioning Based on the System History.
Sometimes the execution of function ξ is unpredictable, triggered by various input changes, which makes estimates of region weights useless. Then measurements of processor loads can be used to obtain region weights, i.e. the weight $w_i$ is proportional to the cumulative time needed for the execution of function ξ in region $R_i$.
External changes of the switch statuses and model updates (inserting or deleting certain elements) lead to load imbalance and cause the graph to change. In an on-line system, statuses of switches are telemetered and collected from field devices. Changes of switch statuses are relatively rare. On the other hand, LF calculations are executed more frequently, depending on other telemetered values used as inputs for the calculation. Because of this difference between the rate of changes of statuses of potential connections and the number of calculation function executions, it is necessary to repartition regions in order to optimize the use of resources, i.e. to increase the speed of calculation. This is the motivation for the research described in this paper.
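One way to keep such a history is sketched below in Python. The class `LoadHistory`, its methods and the scaling factor are hypothetical names chosen for illustration; the sketch only assumes that the execution time of each run of the function in a region can be measured and accumulated.

```python
from collections import defaultdict


class LoadHistory:
    """Accumulates measured execution time per region."""

    def __init__(self):
        self.cumulative = defaultdict(float)

    def record(self, region, seconds):
        # Add the duration of one execution of the function in the region.
        self.cumulative[region] += seconds

    def weights(self, scale=1000):
        # Weight proportional to cumulative time (here: milliseconds),
        # with a minimum of 1 so that no region gets zero weight.
        return {r: max(1, round(t * scale)) for r, t in self.cumulative.items()}


history = LoadHistory()
history.record("R1", 0.012)
history.record("R1", 0.008)
history.record("R2", 0.005)
measured_weights = history.weights()
```

The resulting weights can replace the element counts as vertex weights when the balance of the partitions is evaluated.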

Algorithms
Four algorithms for dynamic repartitioning, DR and CP together with their modifications, are applied when partitions are imbalanced. The developed modifications adapt the algorithms to load balancing of processors on NUMA multiprocessor systems [9].
Diffusion Repartitioning (DR). The diffusion algorithm design is based on the Wavefront algorithm [7]. The outline of the DR algorithm pseudo code is shown below.
function DR_Algorithm($G_p$)
Input: $G_p = (V, E)$, a partitioned imbalanced graph
Output: balanced graph

This is an iterative process based on the calculation of two arrays, outflow and inflow. Parameter outflow[i] represents the sum of the region weights that partition $\pi_i$ is required to send to other partitions. It is calculated (CalculateOutflow) as a difference based on the weight of partition $\pi_i$. Parameter inflow[i] represents the sum of the region weights that partition $\pi_i$ is required to receive from other partitions. It is calculated (CalculateInflow) as the sum of inflow parameters of the partitions that are neighbours with $\pi_i$. In each iteration, a region (R) is selected (SelectRegion) from the partition ($\pi_S$) with the best ratio outflow/inflow (BestRatioPartition), and it is moved to another partition ($\pi_D$) to improve the balance and function F (GetBestNeigbour) [7]. If balancing is not obtained, the least connected region (R) in all overbalanced partitions (found by GetBestMoving) is transferred to the partition with the lowest weight.
Cut-and-Paste Repartitioning (CP). In this repartitioning [6] each border region (BR is the set of border regions) is visited in random order. If the region is in an overbalanced partition and is a neighbour of a non-overbalanced partition, then the region will migrate to the non-overbalanced partition (only if it will not disturb the balance of the destination partition). If the region has several neighbours from different non-overbalanced partitions, then it will migrate to the partition that produces the greatest improvement in the optimization function F (GetBestNeigbour). After each border region is visited exactly once, the process repeats until either all partitions are balanced or no balancing progress is made. At the end, it is possible that the system is still imbalanced, and then the least connected region (GetBestMoving) from the overbalanced partition is found and migrated to the smallest partition. The outline of the CP pseudo code is shown below.
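The main loop of the cut-and-paste idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the pseudo code of the paper: the maximal partition weight is assumed to be the average weight plus a tolerance `eps`, the improvement of F is approximated by the total edge weight from the region towards the destination partition (the edge weight towards the source partition is the same for every candidate, so the ranking is unchanged if F sums internal edge weights), and the final fallback step (GetBestMoving) is omitted. The function name `cut_paste` and the dictionary layout are hypothetical.

```python
import random


def cut_paste(weights, edges, partition_of, num_partitions, eps=0.1, seed=0):
    """Simplified cut-and-paste pass; partition_of is updated in place."""
    rng = random.Random(seed)
    max_w = (1 + eps) * sum(weights.values()) / num_partitions  # assumed M
    load = [0.0] * num_partitions
    for r, w in weights.items():
        load[partition_of[r]] += w

    neighbours = {r: {} for r in weights}
    for (a, b), w in edges.items():
        neighbours[a][b] = w
        neighbours[b][a] = w

    progress = True
    while progress and any(l > max_w for l in load):
        progress = False
        # Border regions have at least one neighbour in another partition.
        border = [r for r in weights
                  if any(partition_of[n] != partition_of[r] for n in neighbours[r])]
        rng.shuffle(border)                       # visit in random order
        for r in border:
            src = partition_of[r]
            if load[src] <= max_w:                # only leave overbalanced partitions
                continue
            gain = {}
            for n, w in neighbours[r].items():
                dst = partition_of[n]
                # The destination must stay balanced after the move.
                if dst != src and load[dst] + weights[r] <= max_w:
                    gain[dst] = gain.get(dst, 0) + w
            if gain:
                dst = max(gain, key=gain.get)     # best-connected destination
                partition_of[r] = dst
                load[src] -= weights[r]
                load[dst] += weights[r]
                progress = True
    return partition_of, load


weights = {"A": 40, "B": 30, "C": 20, "D": 30, "E": 30}
edges = {("A", "B"): 3, ("B", "C"): 2, ("C", "D"): 2, ("D", "E"): 1}
partition_of = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}
partition_of, load = cut_paste(weights, edges, partition_of, num_partitions=2)
```

In this example partition 0 starts with weight 90 against a limit of 82.5, and the border region C is moved to partition 1, which gives loads of 70 and 80.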

Modified Diffusion Repartitioning (MDR).
The DR algorithm is modified to calculate inflow and outflow vectors at two hierarchical levels, which is suitable for NUMA. At the first level DR is applied to balance the sum of partition weights for all NUMA nodes, while at the second level the balance among the partitions associated with a NUMA node is made. The outline of the MDR pseudo code is shown below. The algorithm is initiated by external changes in the state of potential connections between vertices from different partitions (nodes). If an imbalanced partition appears, the first phase of the algorithm is balancing between NUMA nodes (balancing of numaGraph, the graph with NUMA nodes as vertices). Therefore, two arrays are added, outflowNode and inflowNode. Parameter outflowNode[i] represents the sum of the region weights that the NUMA node i is required to send out to other nodes, and inflowNode[i] represents the sum of the region weights that the NUMA node i is required to receive from other nodes. First, parameters inflowNode and outflowNode are calculated, and they are used for deciding which regions should be moved from one NUMA node to others. After the NUMA nodes are balanced, the DR algorithm is applied to rearrange partitions in each node (subGraph, the graph of partitions on a NUMA node).
Modified Cut-and-Paste Repartitioning (MCP). The introduced MCP algorithm slightly modifies CP in the process of moving regions to non-overbalanced partitions. The main idea of this algorithm is similar to the MDR algorithm: repartitioning is done at two levels. At the first level the CP algorithm is applied only to those partitions that are placed in different NUMA nodes, while at the second level the CP algorithm is applied only to regions that are associated with the same NUMA node. The outline of the MCP algorithm pseudo code is shown below. The algorithm starts when a change of the corresponding input ($U_r$) changes the status of potential connections. The first level deals with a graph whose partitions correspond to NUMA nodes and whose edges carry the total weight of the edges between the nodes (numaGraph). The CP repartitioning algorithm is then applied to numaGraph. The second phase of the algorithm involves independent repartitioning of the graphs related to the individual nodes (subGraph). In that case, the edges that connect vertices belonging to different nodes are not taken into consideration (deleting these edges yields unrelated subGraphs). Repartitioning of a given subGraph is done in the same way as the repartitioning of numaGraph, by applying the CP algorithm.
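The two-level structure used by MDR and MCP can be illustrated with a short Python sketch that derives numaGraph and the per-node subGraphs from a partition graph. The function name `build_numa_graph`, the dictionary layout and the mapping `node_of` (partition to NUMA node) are hypothetical and serve only as an illustration of the aggregation step.

```python
def build_numa_graph(partition_weights, partition_edges, node_of):
    """Aggregate a partition graph into numaGraph and per-node subGraphs."""
    # Node weight = sum of the weights of the partitions placed on the node.
    node_weights = {}
    for k, w in partition_weights.items():
        node_weights[node_of[k]] = node_weights.get(node_of[k], 0) + w

    numa_edges = {}
    sub_edges = {n: {} for n in node_weights}
    for (a, b), w in partition_edges.items():
        na, nb = node_of[a], node_of[b]
        if na == nb:
            # Edge inside one node: it belongs to that node's subGraph.
            sub_edges[na][(a, b)] = w
        else:
            # Edge across nodes: contributes to the numaGraph edge weight
            # and is ignored at the second level.
            key = (min(na, nb), max(na, nb))
            numa_edges[key] = numa_edges.get(key, 0) + w
    return node_weights, numa_edges, sub_edges


partition_weights = {0: 100, 1: 90, 2: 60, 3: 70}
partition_edges = {(0, 1): 4, (1, 2): 3, (0, 3): 2, (2, 3): 5}
node_of = {0: 0, 1: 0, 2: 1, 3: 1}    # two NUMA nodes, two partitions each

node_weights, numa_edges, sub_edges = build_numa_graph(
    partition_weights, partition_edges, node_of)
```

In this example node 0 carries weight 190 and node 1 carries weight 130, the single numaGraph edge has weight 5, and each subGraph keeps only the edge between its own two partitions.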

Experimental results
Tests were carried out on models of real electric power distribution networks, whose characteristics are shown in Table 1.
Model it206 is a power distribution network of the city of Milan (Italy), while bg5x is the network model of Belgrade (Serbia) multiplied five times. In this part, test results of the dynamic repartitioning are shown. The experiments were carried out on a NUMA system with 2 nodes and 4 cores per node (CPU AMD Opteron 2.4 GHz, 8 GB RAM per core). The DR, CP, MDR and MCP algorithms were applied for dynamic repartitioning and tested on graphs it206 (Tests 1 and 2) and bg5x (Tests 3 and 4). Tests were carried out on already partitioned graphs that were artificially made imbalanced. In all tests the imbalance is caused by changing the activity of potential connections (1) between regions from different partitions. In this way, two regions become connected, making an imbalanced graph. In Tests 1 and 3 new connections were made only between regions that belong to different NUMA nodes, forcing activity at the first level in the MDR and MCP algorithms. On the other hand, Tests 2 and 4 create new connections only between regions that belong to the same NUMA node. Each test was repeated 100 times.
Experimental results were obtained for function F, the number of repartitioned regions (between partitions, i.e. processors), the number of migrated regions (between NUMA nodes), the total size of migrated regions and the algorithm execution time (Table 2). All experiments were performed using a tolerance set at 10% ($\varepsilon$ = 0.1). Experiments show that DR obtains the best results for function F, because the obtained optimal value F is minimal in almost all tests. The total size of migrations is calculated only for data transferred between NUMA nodes. Using that as a comparison criterion, both MDR and MCP gave the best results (MDR was a bit better), while DR was the worst. Additionally, the number of regions that changed partition and the number of regions moved from one NUMA node to another are shown in Table 2 as well. It is also evident that the MCP algorithm required the highest number of regions moving between partitions, although at the same time the actual number of regions migrated to another NUMA node is the smallest. According to the measured execution times, the fastest algorithm is MCP, CP is a bit slower, and the MDR algorithm is the slowest one (up to 4 times slower than MCP).
Taking all comparison criteria into consideration, we conclude that MDR obtains better results than the other algorithms because of a smaller total migration size and good results for function F, although it executes slower than the other algorithms. On the other hand, MCP is the fastest algorithm and obtains very good results in terms of data migration; however, it is not as good for function F optimization as MDR is. Although MCP provides worse results regarding function F, this criterion is less significant to us, considering that connections between partitions are only potential connections.

Conclusions
In this paper, we developed algorithms for dynamic repartitioning of a large data model. For dynamic repartitioning we applied the diffusion repartitioning (DR) and cut-and-paste (CP) algorithms. Additionally, these algorithms were modified, and the novel MDR and MCP algorithms were developed to support dynamic repartitioning running in NUMA multiprocessor systems. Experimental results show that MDR and MCP obtain better results than the DR and CP algorithms. Because the MDR algorithm is much slower in execution, we recommend using the MCP algorithm for dynamic repartitioning in on-line systems. When the system dynamics do not require fast algorithms, we recommend using the MDR algorithm.

Fig. 1. Connectivity model

If two elements $\alpha_i$ and $\alpha_j$ are in relation $N(\alpha_i, \alpha_j)$, it is assumed that the processing function ξ will use them together (only relations meaningful to ξ are considered).