Distributed Computing (2008)
756 pág.

Distributed Computing (2008)

DisciplinaAlgorítimos Distribuídos32 materiais283 seguidores
Pré-visualização50 páginas
Although n= 2, all the events are strictly ordered, and there is only one linear
order of all the events that is consistent with the \u201cpartial\u201d order. Hence the
dimension of this \u201cpartial order\u201d is 1. A scalar clock such as one implemented
by Lamport\u2019s scalar clock rules is adequate to determine e\u227a f for any events
e and f in this execution.
Now consider an execution on processes P1 and P2 such that each sends
a message to the other before receiving the other\u2019s message. The two send
events are concurrent, as are the two receive events. To determine the causality
between the send events or between the receive events, it is not sufficient
to use a single integer; a vector clock of size n = 2 is necessary. This
execution exhibits the graphical property called a crown, wherein there are
some messages m0\ufffd 
 mn\u22121 such that Send\ufffdmi\ufffd \u227a Receive\ufffdmi+1mod \ufffdn\u22121\ufffd\ufffd for
all i from 0 to n\u22121. A crown of n messages has dimension n. We introduced
the notion of crown and studied its properties in Chapter 6.
For a complex execution, it is not straightforward to determine the
dimension of the partial order. Figure 3.3 shows an execution involving four
processes. However, the dimension of this partial order is two. To see this
59 3.5 Efficient implementations of vector clocks
Figure 3.3 Example illustrating
dimension of a execution
\ufffdE\ufffd\u227a\ufffd. For n= 4 processes,
the dimension is 2.
a b d g h i
a h i
(ii) two linear extensions
< c, e, f, a, b, d, g, h, i, j >
< a, b, c, d, g, h, i, e, j, f >
Range of events &quot;c,&quot; &quot;e,&quot; &quot;f &quot;
informally, consider the longest chain \ufffda\ufffdb\ufffdd\ufffd g\ufffdh\ufffd i\ufffd j\ufffd. There are events
outside this chain that can yield multiple linear extensions. Hence, the dimen-
sion is more than 1. The right side of Figure 3.3 shows the earliest possible
and the latest possible occurrences of the events not in this chain, with respect
to the events in this chain. Let \ufffd1 be \ufffdc\ufffd e\ufffd f\ufffda\ufffd b\ufffdd\ufffd g\ufffdh\ufffd i\ufffd j\ufffd, which contains
the following tuples that are not in \ufffd:
\ufffdc\ufffda\ufffd\ufffd \ufffdc\ufffd b\ufffd\ufffd \ufffdc\ufffdd\ufffd\ufffd \ufffdc\ufffd g\ufffd\ufffd \ufffdc\ufffdh\ufffd\ufffd \ufffdc\ufffd i\ufffd\ufffd \ufffdc\ufffd j\ufffd\ufffd
\ufffde\ufffda\ufffd\ufffd \ufffde\ufffd b\ufffd\ufffd \ufffde\ufffdd\ufffd\ufffd \ufffde\ufffd g\ufffd\ufffd \ufffde\ufffdh\ufffd\ufffd \ufffde\ufffd i\ufffd\ufffd \ufffde\ufffd j\ufffd\ufffd
\ufffdf\ufffda\ufffd\ufffd \ufffdf\ufffd b\ufffd\ufffd \ufffdf\ufffdd\ufffd\ufffd \ufffdf\ufffd g\ufffd\ufffd \ufffdf\ufffdh\ufffd\ufffd \ufffdf\ufffd i\ufffd\ufffd \ufffdf\ufffd j\ufffd\ufffd
Let \ufffd2 be \ufffda\ufffdb\ufffd c\ufffdd\ufffd g\ufffdh\ufffd i\ufffd e\ufffd j\ufffd f\ufffd, which contains the following tuples not
in \ufffd:
\ufffda\ufffd c\ufffd\ufffd \ufffdb\ufffd c\ufffd\ufffd \ufffdc\ufffdd\ufffd\ufffd \ufffdc\ufffd g\ufffd\ufffd \ufffdc\ufffdh\ufffd\ufffd \ufffdc\ufffd i\ufffd\ufffd \ufffdc\ufffd j\ufffd\ufffd
\ufffda\ufffd e\ufffd\ufffd \ufffdb\ufffd e\ufffd\ufffd \ufffdd\ufffd e\ufffd\ufffd \ufffdg\ufffd e\ufffd\ufffd \ufffdh\ufffd e\ufffd\ufffd \ufffdi\ufffd e\ufffd\ufffd \ufffde\ufffd j\ufffd\ufffd
\ufffda\ufffd f\ufffd\ufffd \ufffdb\ufffd f\ufffd\ufffd \ufffdd\ufffd f\ufffd\ufffd \ufffdg\ufffd f\ufffd\ufffd \ufffdh\ufffd f\ufffd\ufffd \ufffdi\ufffd f\ufffd\ufffd \ufffdj\ufffd f\ufffd\ufffd
Further, observe that \ufffd\ufffd1 \ P\ufffd
\ufffd2 = \u2205 and \ufffd\ufffd2 \ P\ufffd
\ufffd1 = \u2205. Hence,
\ufffd2 = \ufffd and the dimension of the execution is 2 as these two linear
extensions are enough to generate \ufffd .
Unfortunately, it is not computationally easy to determine the dimension
of a partial order. To exacerbate the problem, the above form of analysis has
to be completed a posteriori (i.e., off-line), once the entire partial order has
been determined after the completion of the execution.
3.5 Efficient implementations of vector clocks
If the number of processes in a distributed computation is large, then vector
clocks will require piggybacking of huge amount of information in messages
for the purpose of disseminating time progress and updating clocks. The
60 Logical time
message overhead grows linearly with the number of processors in the sys-
tem and when there are thousands of processors in the system, the mes-
sage size becomes huge even if there are only a few events occurring in
few processors. In this section, we discuss efficient ways to maintain vec-
tor clocks; similar techniques can be used to efficiently implement matrix
Charron-Bost showed [2] that if vector clocks have to satisfy the strong
consistency property, then in general vector timestamps must be at least of
size n, the total number of processes. Therefore, in general the size of a vector
timestamp is the number of processes involved in a distributed computation;
however, several optimizations are possible and next, we discuss techniques
to implement vector clocks efficiently [19].
3.5.1 Singhal\u2013Kshemkalyani\u2019s differential technique
Singhal\u2013Kshemkalyani\u2019s differential technique [25] is based on the observa-
tion that between successive message sends to the same process, only a few
entries of the vector clock at the sender process are likely to change. This
is more likely when the number of processes is large because only a few of
them will interact frequently by passing messages. In this technique, when a
process pi sends a message to a process pj , it piggybacks only those entries
of its vector clock that differ since the last message sent to pj .
The technique works as follows: if entries i1\ufffd i2\ufffd 
 \ufffd in1 of the vector clock
at pi have changed to v1\ufffd v2\ufffd 
 \ufffd vn1 , respectively, since the last message sent
to pj , then process pi piggybacks a compressed timestamp of the form
	\ufffdi1\ufffd v1\ufffd\ufffd \ufffdi2\ufffd v2\ufffd\ufffd 
 \ufffd \ufffdin1\ufffd vn1\ufffd\ufffd
to the next message to pj . When pj receives this message, it updates its vector
clock as follows:
vti\ufffdik\ufffd=max\ufffdvti\ufffdik\ufffd\ufffd vk\ufffd for k= 1\ufffd2\ufffd 
 \ufffd n1.
Thus this technique cuts down the message size, communication bandwidth
and buffer (to store messages) requirements. In the worst of case, every
element of the vector clock has been updated at pi since the last message
to process pj , and the next message from pi to pj will need to carry the
entire vector timestamp of size n. However, on the average the size of the
timestamp on a message will be less than n. Note that implementation of
this technique requires each process to remember the vector timestamp in
the message last sent to every other process. Direct implementation of this
will result in O\ufffdn2\ufffd storage overhead at each process. This technique also
requires that the communication channels follow FIFO discipline for message
Singhal and Kshemkalyani developed a clever technique that cuts down
this storage overhead at each process to O\ufffdn\ufffd. The technique works in
61 3.5 Efficient implementations of vector clocks
the following manner: process pi maintains the following two additional
\u2022 LSi\ufffd1
 n\ufffd (\u2018Last Sent\u2019):
LSi\ufffdj\ufffd indicates the value of vti\ufffdi\ufffd when process pi last sent a message to
process pj .
\u2022 LUi\ufffd1
 n\ufffd (\u2018Last Update\u2019):
LUi\ufffdj\ufffd indicates the value of vti\ufffdi\ufffd when process pi last updated the entry
Clearly, LUi\ufffdi\ufffd = vti\ufffdi\ufffd at all times and LUi\ufffdj\ufffd needs to be updated only
when the receipt of a message causes pi to update entry vti\ufffdj\ufffd. Also, LSi\ufffdj\ufffd
needs to be updated only when pi sends a message to pj . Since the last
communication from pi to pj , only those elements k of vector clock vti\ufffdk\ufffd
have changed for which LSi\ufffdj\ufffd < LUi\ufffdk\ufffd holds. Hence, only these elements
need to be sent in a message from pi to pj . When pi sends a message to pj ,
it sends only a set of tuples,
	\ufffdx\ufffd vti\ufffdx\ufffd\ufffd\ufffdLSi\ufffdj\ufffd < LUi\ufffdx\ufffd\ufffd,
as the vector timestamp to pj , instead of sending a vector of n entries in a
Thus the entire vector of size n is not sent along with a message. Instead,
only the elements in the vector clock that have changed since the last
message send to that process are sent in the format 	\ufffdp1\ufffd latest_value\ufffd\ufffd
\ufffdp2\ufffd latest_value\ufffd\ufffd 
 \ufffd, where pi indicates that the pith component of the
vector clock has changed.
This method is illustrated in Figure 3.4. For instance, the second message
from p3 to p2 (which contains a timestamp 	\ufffd3\ufffd2\ufffd\ufffd) informs p2 that the
third component of the vector clock has been modified and the new value
is 2. This is because the process p3 (indicated by the third component of
Figure 3.4 Vector
clocks progress in
technique [19].
{(3,1)} {(3,2)} {(3,4),(4,1)}
62 Logical time