756 pág.

# Distributed Computing (2008)

DisciplinaAlgorítimos Distribuídos32 materiais283 seguidores
Pré-visualização50 páginas
```in a distributed computation
another instance of constructing all consistent snapshots that contain check-
points from a given set, we make a recursive call (line 14), passing T \u222aC
and a ProcSet from which process pq is removed. The recursion eventually
terminates when the passed set contains checkpoints from all processes (i.e.,
ProcSet is empty). In this case T is a global snapshot, as it contains one
checkpoint from every process, and is added to G (line 10). When the algo-
rithm terminates, all candidates in Suseful have been used in extending S, so
G contains all consistent snapshots that contain S.
The following theorem argues the correctness of the algorithm.
Theorem 4.3 Let S be a set of checkpoints and G be the set returned by
ComputeAllCgs\ufffdS\ufffd. If S \ufffd\ufffdS, then T \u2208 G if and only if T is a consistent
snapshot containing S. That is, G contains exactly the consistent snapshots
that contain S.
We omit the proof of the theorem and interested readers can refer to the
original paper [21] for a proof.
4.9.3 Finding Z-paths in a distributed computation
Tracking Z-paths on-the-fly is difficult and remains an open problem. We
describe a method for determining the existence of Z-paths between check-
points in a distributed computation that has terminated or has stopped
execution, using the rollback-dependency graph (R-graph) introduced by
Wang [33, 34]. First, we present the definition of an R-graph.
definition 4.5 The rollback-dependency graph of a distributed computation
is a directed graph G= \ufffdV\ufffdE\ufffd, where the vertices V are the checkpoints of
the distributed computation, and an edge \ufffdCp\ufffdi\ufffdCq\ufffdj\ufffd from checkpoint Cp\ufffdi to
checkpoint Cq\ufffdj belongs to E if
1. p= q and j = i+1, or
2. p \ufffd= q and a message m sent from the ith checkpoint interval of pp is
received by pq in its jth checkpoint interval (i\ufffd j > 0).
Construction of an R-graph
When a process pp sends a message m in its ith checkpoint interval, it
piggybacks the pair \ufffdp\ufffd i\ufffd with the message. When the receiver pq receives
m in its jth checkpoint interval, it records the existence of an edge from Cp\ufffdi
to Cq\ufffdj . When a process wants to construct the R-graph for finding Z-paths
between checkpoints, it broadcasts a request message to collect the existing
direct dependencies from all other processes and constructs the complete R-
graph. We assume that each process stops execution after it sends a reply
to the request so that additional dependencies between checkpoints are not
formed while the R-graph is being constructed. For each process, a volatile
120 Global state and snapshot recording algorithms
Figure 4.7 A distributed
computation.
p2
p3
p1
m6m2
C1,0 C1,1
C2,0
C3,0 C3,1
C2,1
C1,2
C2,2
C3,2
m3
m1 m4
m5
Figure 4.8 The R-graph of the
computation in Figure 4.7.
C3,1 C3,3C3,2
C3,0
C2,3
C2,2
C2,1C2,0
C1,0 C1,1 C1,2 C1,3
Volatile
checkpoints
checkpoint is added; the volatile checkpoint represents the volatile state of
the process [33, 34].
Example 4.1 An R-graph Figure 4.8 shows the R-graph of the computa-
tion shown in Figure 4.7. In Figure 4.8, C1\ufffd3\ufffdC2\ufffd3\ufffd and C3\ufffd3 represent the
volatile checkpoints, the checkpoints representing the last state the process
attained before terminating.
We denote the fact that there is a path from C to D in the R-graph by
C
rd
\ufffd D. It only denotes the existence of a path; it does not specify any
particular path. For example, in Figure 4.8, C1\ufffd0
rd
\ufffd C3\ufffd2. When we need to
specify a particular path, we give the sequence of checkpoints that constitute
the path. For example, \ufffdC1\ufffd0\ufffdC1\ufffd1\ufffdC1\ufffd2\ufffdC2\ufffd1\ufffdC3\ufffd1\ufffdC3\ufffd2\ufffd is a path from C1\ufffd0 to
C3\ufffd2 and \ufffdC1\ufffd0\ufffdC1\ufffd1\ufffdC1\ufffd2\ufffdC2\ufffd1\ufffdC2\ufffd2\ufffdC2\ufffd3\ufffdC3\ufffd2\ufffd is also a path from C1\ufffd0 to C3\ufffd2.
The following theorem establishes the correspondence between the paths
in the R-graph and the Z-paths between checkpoints. This correspondence is
very useful in determining whether or not a Z-path exists between two given
checkpoints.
Theorem 4.4 Let G= \ufffdV\ufffdE\ufffd be the R-graph of a distributed computation.
Then, for any two checkpoints Cp\ufffdi and Cq\ufffdj , Cp\ufffdi\ufffdCq\ufffdj if and only if
1. p= q and i < j, or
2. Cp\ufffdi+1
rd
\ufffd Cq\ufffdj in G (note that in this case p could still be equal to q).
For example, in the distributed computation shown in Figure 4.7, a zigzag
path exists from C1\ufffd1 to C3\ufffd1 because in the corresponding R-graph, shown
in Figure 4.8, C1\ufffd2
rd
\ufffd C3\ufffd1. Likewise, C2\ufffd1 is on a Z-cycle because in the
corresponding R-graph, shown in Figure 4.8, C2\ufffd2
rd
\ufffd C2\ufffd1.
121 4.10 Chapter summary
4.10 Chapter summary
Recording global state of a distributed system is an important paradigm in
the design of the distributed systems and the design of efficient methods of
recording the global state is an important issue. Recording of global state of
a distributed system is complicated due to the lack of both a globally shared
memory and a global clock in a distributed system. This chapter first presented
a formal definition of the global state of a distributed system and exposed
issues related to its capture; it then described several algorithms to record a
snapshot of a distributed system under various communication models.
Table 4.1 gives a comparison of the salient features of the various snapshot
recording algorithms. Clearly, the higher the level of abstraction provided by
a communication model, the simpler the snapshot algorithm. However, there
is no best performing snapshot algorithm and an appropriate algorithm can be
chosen based on the application\u2019s requirement. For examples, for termination
detection, a snapshot algorithm that computes a channel state as the number
of messages is adequate; for checkpointing for recovery from failures, an
incremental snapshot algorithm is likely to be the most efficient; for global
state monitoring, rather than recording and evaluating complete snapshots at
regular intervals, it is more efficient to monitor changes to the variables that
affect the predicate and evaluate the predicate only when some component
variable changes.
As indicated in the introduction, the paradigm of global snapshots finds a
large number of applications (such as detection of stable properties, check-
pointing, monitoring, debugging, analyses of distributed computation, dis-
carding of obsolete information). Moreover, in addition to the problems they
solve, the algorithms presented in this chapter are of great importance to
people interested in distributed computing as these algorithms illustrate the
incidence of properties of communication channels (FIFO, non-FIFO, causal
ordering) on the design of a class of distributed algorithms.
We also discussed the necessary and sufficient conditions for consistent
snapshots. The non-causal path between checkpoints in a snapshot corre-
sponds to the necessary condition for consistent snapshot, and the non-zigzag
path corresponds to the necessary and sufficient conditions for consistent
snapshot. Tracking of zigzag path is helpful in forming a global consistent
snapshot. The avoidance of zigzag path between any pair of checkpoints from
a collection of checkpoints (snapshot) is the necessary and sufficient condi-
tions for a consistent global snapshot. Avoidance of causal paths alone will
not be sufficient for consistency.
We also presented an algorithm for finding all consistent snapshots con-
taining a given set S of local checkpoints; if we take S=\u2205, then the algorithm
gives the set of all consistent snapshots of a distributed computation run.
We established the correspondence between the Z-paths and the paths in the
R-graph which helps in finding the existence of Z-paths between checkpoints.
122 Global state and snapshot recording algorithms
4.11 Exercises
Exercise 4.1 Consider the following simple method to collect a global snapshot (it
may not always collect a consistent global snapshot): an initiator process takes its
snapshot```