Distributed cycle detection in large-scale sparse graphs
In this paper we present a distributed algorithm for detecting cycles in large-scale directed graphs, along with its correctness proof and analysis. The algorithm is then extended to find strong components in directed graphs. We indicate an application to detecting cycles in number theoretic functions such as the proper divisor function. Our prototype implementation of the cycle detection algorithm, when applied to the proper divisor function, detects all sociable groups of numbers (cycles in the proper divisor function) up to 10^7.
KEYWORDS. Graph theory, cycle detection, distributed algorithms.
Main Area: TAG - Theory and Algorithms in Graphs.
XLVII SIMPÓSIO BRASILEIRO DE PESQUISA OPERACIONAL, Porto de Galinhas, Pernambuco, 25–28 August 2015.
1. Introduction
		The growing scale and importance of graph data has driven the development of numerous 
		new parallel and distributed graph processing systems. For these massive data applications, the 
resulting graphs can have billions of connections, and are usually highly sparse, with a complex, irregular, and often power-law structure [28].
There are a great many theoretical and practical problems that deal with sparse graphs. For
		example, the Internet is a sparse network with average degree less than 4, see [8] and [10]. The 
		networks of citations of scientific papers are also sparse, with the average number of references in a 
		paper of the order of 10 [19]. Similarly, data sets of biological interactions often constitute sparse 
		networks [7, 21], e.g., protein-protein interaction networks, gene regulation networks, etc. Biological neural networks are also represented by sparse networks [25]. Problems involving number 
		theoretic functions also lead to sparse graphs, as described in Section 3.1.4. 
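The number-theoretic case is easy to make concrete. The sketch below (Python; the helper names are our own, and this is not the paper's prototype) iterates the proper divisor function s(n) and returns the sociable group, i.e. the cycle of the aliquot sequence, closed by a starting number.

```python
def aliquot(n):
    """s(n): the sum of the proper divisors of n."""
    if n <= 1:
        return 0
    total = 1                       # 1 is a proper divisor of every n > 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:         # avoid counting a square root twice
                total += n // d
        d += 1
    return total

def sociable_group(n, limit=30):
    """Follow n, s(n), s(s(n)), ...; return the cycle through n, if any."""
    seq, seen = [n], {n}
    for _ in range(limit):
        n = aliquot(n)
        if n == seq[0]:
            return seq              # the sequence closed back on its start
        if n == 0 or n in seen:
            return None             # terminated, or entered another cycle
        seq.append(n)
        seen.add(n)
    return None

print(sociable_group(6))       # perfect number: cycle of length 1 → [6]
print(sociable_group(220))     # amicable pair → [220, 284]
print(sociable_group(12496))   # a sociable group of order 5
```

A perfect number is thus a cycle of length 1 in the proper divisor function, an amicable pair a cycle of length 2, and longer sociable groups correspond to longer cycles.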
		Most parallel graph processing systems, such as Pregel [16], GraphX [29, 28], GPS [22], 
		and Giraph [3], are mainly based on the bulk synchronous parallel model [26]. PowerGraph [12] 
		supports both bulk synchronous and asynchronous implementations. 
In the bulk synchronous parallel model [26] for graph processing, the large input graph is partitioned among the worker machines. Each worker machine is responsible for executing the vertices assigned to it. A superstep concept is then used for coordinating the parallel execution of computations on the worker machines. A superstep consists of three parts: (i) concurrent computation, (ii) communication, and (iii) a synchronisation barrier. Every worker machine concurrently executes computations for the vertices it is responsible for. Each worker machine sends messages on behalf of the vertices it is responsible for to their neighbours. The neighbouring vertices may or may not be on the same worker machine. When a worker machine reaches the barrier, it waits until all other worker machines have finished their communication actions, before the system as a whole can move to the next superstep. A computation involves many supersteps executed one after the other in this manner. So, in a superstep, a worker uses the values communicated via messages from the previous superstep, rather than the most recent values.
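The superstep discipline just described can be sketched as a single-process simulation (an illustrative sketch in Python, not any framework's API; the function names are our own):

```python
from collections import defaultdict

def run_supersteps(vertices, compute, max_steps=100):
    """Single-process sketch of the bulk synchronous parallel model.

    compute(v, step, messages, send) runs once per active vertex per
    superstep; messages sent in superstep i are delivered in superstep
    i + 1, never earlier (the synchronisation barrier)."""
    inbox = {v: [] for v in vertices}
    active = set(vertices)                  # every vertex starts active
    for step in range(max_steps):
        outbox = defaultdict(list)          # buffered until the barrier
        for v in sorted(active):
            compute(v, step, inbox[v], lambda w, m: outbox[w].append(m))
        # barrier: deliver all buffered messages, then recompute activity
        inbox = {v: outbox.get(v, []) for v in vertices}
        active = {v for v in vertices if inbox[v]}
        if not active:
            return step + 1                 # supersteps executed
    return max_steps
```

For example, forwarding a single token along the path 0 → 1 → 2 → 3 takes four supersteps under this discipline, since each hop only becomes visible after a barrier.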
In this paper, we propose a distributed algorithm for detecting cycles in large-scale directed graphs based on the bulk synchronous message passing abstraction. The proposed algorithm is suitable for implementation in distributed graph processing systems, and also in systems for disk-based computation, such as GraphChi [15], where the computation relies mainly on secondary memory. Disk-based computations are necessary when a single computer must process a large-scale graph and the computation exceeds the primary memory capacity.
		The rest of this paper is structured as follows. Graph theoretic notation and preliminaries are fixed in Section 2. A cycle detection algorithm (Algorithm 1) based on message passing 
		is presented in Section 3.1. The correctness of Algorithm 1 is proved in Section 3.1.1. In Sections 3.1.3 and 3.1.4, the total number of iterations and the number of messages exchanged at each 
		iteration of Algorithm 1 are analysed. In Section 4, an algorithm for finding the strongly connected 
		components of a graph is presented (Algorithm 2); the algorithm makes use of the cycle detection 
		algorithm (Algorithm 1). 
		2. Graph theoretic notation 
Throughout this paper we assume that G := (V, E) is a finite directed graph, where V is the set of vertices, E ⊆ V × V is the set of arcs, ν(G) := |V|, and ε(G) := |E|. Let v ∈ V. We denote the set of out-neighbours of v by N+(v), the set of in-neighbours of v by N−(v), the out-degree of v by d+(v), and the in-degree of v by d−(v). We denote by Δ+(G) the maximum out-degree of vertices of G. A walk in G is a sequence (v0, e1, v1, ..., ek, vk), where each vi is a vertex and each ei is an arc from vi−1 to vi. The length of a walk is the number of arcs in the walk. A closed walk is a walk (v0, e1, v1, ..., ek, vk) in which the origin v0 and the terminus vk are equal. A cycle is a closed walk in which all vertices except the origin and the terminus are distinct.
		A cycle of length 0 has a single vertex and no arcs. In this paper, we consider only non-trivial 
		cycles, i.e., cycles containing at least one arc. A cycle of length 1 has one arc, and corresponds to a 
loop in the graph. Since our definition of a directed graph permits at most one arc between any pair of vertices, there is no ambiguity when we denote a cycle of length k by a sequence of k + 1 vertices.
		For example, (v) is a cycle of length 0, and (v, v) is a cycle of length 1, and so on. We refer to [27] 
		for standard graph theoretic notions not defined above. 
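For readers implementing the above, the notation maps directly onto adjacency lists (an illustrative encoding of our own, not taken from the paper):

```python
# out[v] lists the out-neighbours N+(v), so d+(v) = len(out[v]).
# The graph below (1 → 2 → 3 → 1, plus 3 → 4) is an arbitrary example.
out = {1: [2], 2: [3], 3: [1, 4], 4: []}

nu = len(out)                                  # ν(G) = |V|
eps = sum(len(ws) for ws in out.values())      # ε(G) = |E|
indeg = {v: 0 for v in out}                    # d−(v), built from the arcs
for v, ws in out.items():
    for w in ws:
        indeg[w] += 1
max_out = max(len(ws) for ws in out.values())  # Δ+(G)
print(nu, eps, max_out)                        # → 4 4 2
```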
		3. Searching for cycles 
		An efficient method for detecting cycles in a directed graph is to use the depth-first search 
		(DFS) algorithm, considering the fact that a directed graph has a cycle if and only if DFS finds a back 
arc. The running time of DFS on a directed graph G is Θ(ν(G) + ε(G)), and it is asymptotically
		optimal [6]. 
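The back-arc criterion can be written in a few lines (an illustrative Python sketch of the standard three-colour DFS, for comparison with the distributed algorithm that follows):

```python
def has_cycle(adj):
    """A directed graph has a cycle iff DFS finds a back arc: an arc into
    a vertex that is still on the current DFS path (coloured GREY)."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {v: WHITE for v in adj}

    def dfs(u):
        colour[u] = GREY
        for w in adj[u]:
            if colour[w] == GREY:       # back arc found
                return True
            if colour[w] == WHITE and dfs(w):
                return True
        colour[u] = BLACK               # u and its descendants are cycle-free
        return False

    return any(colour[v] == WHITE and dfs(v) for v in adj)
```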
		Although DFS is asymptotically optimal, Reif [20] suggests that it cannot be effectively 
		parallelised, by proving the polynomial time completeness (P-completeness) of DFS. The remainder 
		of this section describes and analyses a distributed cycle detection algorithm with intrinsic potential 
		for parallelism. 
		3.1. Detecting cycles by message passing 
		In the context of big data, where the graph structure can be large enough to saturate the 
		processing power or memory capacity of a single machine, it is difficult to effectively parallelise 
		the DFS algorithm. Hence we need an algorithm that divides the problem into subproblems among 
		computational nodes, so that the nodes can search for cycles in a parallel manner with the certainty 
		that all cycles are found. 
		We propose a general algorithm for detecting cycles in a directed graph G by message 
		passing among its vertices, based on the bulk synchronous message passing abstraction. This is a 
		vertex-centric approach in which the vertices of the graph work together for detecting cycles. The 
		bulk synchronous parallel model consists of a sequence of iterations, in each of which a vertex 
		can receive messages sent by other vertices in the previous iteration, and send messages to other 
		vertices. 
In each pass, each active vertex of G sends a set of sequences of vertices to its out-neighbours, as described next. In the first pass, each vertex v sends the message (v) to all its out-neighbours. In subsequent iterations, each active vertex v appends v to each sequence it received in the previous iteration. It then sends all the updated sequences to its out-neighbours. If v has not received any message in the previous iteration, then v deactivates itself. The algorithm terminates when all the vertices have been deactivated.
For a sequence (v1, v2, ..., vk) received by vertex v, the appended sequence is not forwarded in two cases: (i) if v = v1, then v has detected a cycle, which is reported (see line 9 of Algorithm 1); (ii) if v = vi for some i ∈ {2, 3, ..., k}, then v has detected a sequence that contains the cycle (v = vi, vi+1, ..., vk, vk+1 = v); in this case, the sequence is discarded, since the cycle must have been detected in an earlier iteration (see line 11 of Algorithm 1); to be precise, this cycle must have been detected in iteration k − i + 1. Every cycle (v1, v2, ..., vk, vk+1 = v1) is detected by all vi, i = 1 to k, in the same iteration; it is reported by the vertex min{v1, ..., vk} (see line 9 of Algorithm 1).
The total number of iterations of the algorithm is the number of vertices in the longest path in the graph, plus a few more steps for deactivating the final vertices. During the analysis of the total number of iterations, we ignore the few extra iterations needed for deactivating the final vertices and detecting the end of the computation, since their number is O(1). In practice, the actual number of these final few iterations depends on the framework used to implement the algorithm.
We count iterations as i = 0, 1, .... Let M_i(v) be the set of messages (sequences of vertices) received by v at iteration i. Since messages sent in iteration i = 0 are received in iteration i = 1, M_0(v) = ∅.
[Figure 1 appears here: a five-vertex example graph traced through iterations i = 0 to i = 5. Vertices exchange growing sequences such as [1, 2] and [2, 3, 4]; at i = 3 vertex 2 outputs the cycle [2, 3, 4], and by i = 5 all vertices have halted.]
Algorithm 1 Pseudocode for the compute function of the distributed cycle detection algorithm. The algorithm takes G as input, and for each superstep i the function COMPUTE(M_i(v)) is executed for each active vertex v.
1: function COMPUTE(M_i(v))
2:   if i = 0 then
3:     for each w ∈ N+(v) do
4:       send (v) to w
5:   else if M_i(v) = ∅ then
6:     deactivate v and halt
7:   else
8:     for each (v1, v2, ..., vk) ∈ M_i(v) do
9:       if v1 = v and min{v1, v2, ..., vk} = v then
10:        report (v1 = v, v2, ..., vk, vk+1 = v)
11:      else if v ∉ {v2, ..., vk} then
12:        for each w ∈ N+(v) do
13:          send (v1, v2, ..., vk, v) to w
Figure 1 presents an example of the execution of the algorithm. In iteration i = 3, all three vertices on the cycle detect the cycle [2, 3, 4]. We ensure that the cycle is reported only once by emitting the detected cycle only from the vertex with the least identifier value in the ordered sequence, which is vertex 2 in the example.
		Figure 1: Example of the algorithm for detecting cycles by message passing. 
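The execution traced in Figure 1 can be reproduced with a short sequential sketch of Algorithm 1 (rounds stand in for supersteps; the adjacency list below is our reading of the figure's graph, so treat it as an assumption):

```python
def detect_cycles(adj):
    """Sequential sketch of Algorithm 1: rounds stand in for supersteps,
    and a sequence sent in round i is only read in round i + 1."""
    inbox = {v: [] for v in adj}
    for v in adj:                     # round i = 0: every vertex sends (v)
        for w in adj[v]:
            inbox[w].append((v,))
    cycles = []
    while any(inbox.values()):
        outbox = {v: [] for v in adj}
        for v, msgs in inbox.items():
            for seq in msgs:
                if seq[0] == v and min(seq) == v:
                    cycles.append(seq)        # line 9: report at the min vertex
                elif v not in seq[1:]:        # line 11: otherwise discard
                    for w in adj[v]:
                        outbox[w].append(seq + (v,))
        inbox = outbox                        # synchronisation barrier
    return cycles

# Our reading of the Figure 1 graph (an assumption): 1→2, 2→3, 3→4,
# 3→5, 4→2.  The closing arc back to seq[0] is left implicit.
print(detect_cycles({1: [2], 2: [3], 3: [4, 5], 4: [2], 5: []}))
# → [(2, 3, 4)]
```

As in the figure, the single cycle (2, 3, 4) is reported exactly once, by vertex 2, and all other rotations of the cycle are discarded.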
		3.1.1. A correctness proof using loop-invariant 
Let C be the set of all non-trivial cycles in G (i.e., cycles with at least one arc). Let S := (v1, v2, ..., vi) be a message and u a vertex; we define S.u := (v1, v2, ..., vi, u) as the concatenation of the message S with the vertex u.
Loop-invariant: For i ≥ 1, the set C_i contains all non-trivial cycles of length i, which are detected and reported in iteration i, and the set M_i(v) contains all messages of length i received by v in iteration i.
Base case. In iteration i = 1, each vertex v receives a message (w) from each of its in-neighbours w. Hence, M_1(v) = {(w) | w ∈ N−(v)}. If there is a loop at v, then v receives the message (v) in iteration i = 1; in this case, v detects and reports the cycle (v, v) of length 1. Hence, at the end of iteration i = 1, the set C_1 contains all cycles of length 1.
Induction hypothesis. Suppose the loop invariant holds at the end of iteration i for some i ≥ 1, i.e., C_i contains all cycles of length i detected in iteration i, and M_i(v) contains all messages of length i received by v in iteration i.
Inductive step. We prove that
M_{i+1}(v) = ⋃_{w ∈ N−(v)} {S.w | S := (v1, v2, ..., vi) ∈ M_i(w) and w ≠ vk for all k ∈ [1, i]}
and
C_{i+1} = ⋃_{v ∈ V(G)} {S.v | S := (v1, v2, ..., v_{i+1}) ∈ M_{i+1}(v) and v1 = v}.
By the induction hypothesis, the set M_i(w) contains all messages of length i that reach w, for all w ∈ N−(v). If S := (v1, v2, ..., vi) ∈ M_i(w) and w ∉ S, then w sends the message S.w := (v1, v2, ..., vi, v_{i+1} = w) (of length i + 1) to all its out-neighbours (one of them being v). Hence the message S.w is received by v in iteration i + 1. Thus M_{i+1}(v) is the set of messages of length i + 1 that reach v in iteration i + 1. To prove that C_{i+1} contains all cycles of length i + 1, observe that the set M_{i+1}(v) contains (v1 = v, v2, ..., v_{i+1}) iff there exists a cycle (v1 = v, v2, ..., vi, v_{i+1}, v_{i+2} = v) of length i + 1 that starts and finishes at vertex v. Therefore, the loop-invariant still holds at iteration i + 1, and the algorithm constructs C = ⋃_{k=1}^{ν(G)} C_k.
		3.1.2. Graph partitioning and communication by message passing 
When considering distributed graph processing frameworks, partitioning the input graph is crucial for achieving an efficient distributed execution. Since the graph structure describes the data movement, minimising storage and communication overhead, as well as balancing the computation, depends on the graph partitioning performed by the framework.
Most graph processing frameworks try to optimise the partitioning strategy so as to maximise the number of messages exchanged directly via shared memory rather than over the network. The two most common partitioning strategies are based on edge-cuts and vertex-cuts.
		In the edge-cut partitioning scheme [12, 29], the vertex set of a graph is partitioned into 
		blocks, and each block of the partition is processed on a distinct worker machine. Messages between 
vertices in the same block are exchanged directly via main memory, reducing communication overhead and data movement over the network. Since constructing an optimal edge-cut for large-scale graphs can be prohibitively expensive, many graph processing frameworks use a random edge-cut (i.e., randomly distributing vertices across the cluster).
		Vertex-cuts evenly assign edges to machines, and allow vertices to span multiple worker 
		machines. For power-law graphs, the vertex-cut strategy can reduce communication overhead and 
		ensure balanced computation by evenly assigning edges to machines in a way that minimises the 
		number of machines spanned by each vertex [12, 29]. 
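The two strategies can be contrasted with a toy cost model (illustrative code; the hash-based placement and the function names are our own, loosely following the random placements described above):

```python
import hashlib

def worker_of(x, k):
    """Deterministically hash an identifier onto one of k workers."""
    return int(hashlib.sha1(repr(x).encode()).hexdigest(), 16) % k

def edge_cut_stats(edges, k):
    """Random edge-cut: vertices are hashed to workers, and an arc whose
    endpoints land on different workers costs one network message."""
    remote = sum(worker_of(u, k) != worker_of(v, k) for u, v in edges)
    return remote, len(edges)

def vertex_cut_stats(edges, k):
    """Random vertex-cut: arcs are hashed to workers, and the overhead is
    the number of extra (vertex, worker) replicas beyond one per vertex."""
    placement = {}
    for u, v in edges:
        w = worker_of((u, v), k)
        for x in (u, v):
            placement.setdefault(x, set()).add(w)
    return sum(len(ws) - 1 for ws in placement.values()), len(placement)
```

Under an edge-cut the cost shows up as remote arcs; under a vertex-cut it shows up as vertex replicas, which for power-law graphs is typically the smaller of the two quantities.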
3.1.3. The worst case and the average case analysis of the number of messages sent in iteration t
Let G be a graph on n vertices. The worst case scenario occurs when G is a complete directed graph with loops. In this case, we have n iterations. Let n_t denote the falling factorial



 