Pré-visualização50 páginas

55 3.4 Vector time of process P1 has smaller scalar timestamp than the third event of process P2. However, the former did not happen before the latter. The reason that scalar clocks are not strongly consistent is that the logical local clock and logical global clock of a process are squashed into one, resulting in the loss causal dependency information among events at different processes. For example, in Figure 3.1, when process P2 receives the first message from process P1, it updates its clock to 3, forgetting that the timestamp of the latest event at P1 on which it depends is 2. 3.4 Vector time 3.4.1 definition The system of vector clocks was developed independently by Fidge [4], Mattern [12], and Schmuck [23]. In the system of vector clocks, the time domain is represented by a set of n-dimensional non-negative integer vectors. Each process pi maintains a vector vti\ufffd1\ufffd\ufffdn\ufffd, where vti\ufffdi\ufffd is the local logical clock of pi and describes the logical time progress at process pi. vti\ufffdj\ufffd rep- resents process pi\u2019s latest knowledge of process pj local time. If vti\ufffdj\ufffd = x, then process pi knows that local time at process pj has progressed till x. The entire vector vti constitutes pi\u2019s view of the global logical time and is used to timestamp events. Process pi uses the following two rules R1 and R2 to update its clock: \u2022 R1 Before executing an event, process pi updates its local logical time as follows: vti\ufffdi\ufffd \ufffd= vti\ufffdi\ufffd+d \ufffdd > 0\ufffd\ufffd \u2022 R2 Each message m is piggybacked with the vector clock vt of the sender process at sending time. On the receipt of such a message (m,vt), process pi executes the following sequence of actions: 1. update its global logical time as follows: 1\u2264 k\u2264 n \ufffd vti\ufffdk\ufffd \ufffd=max\ufffdvti\ufffdk\ufffd\ufffd vt\ufffdk\ufffd\ufffd\ufffd 2. execute R1; 3. deliver the message m. The timestamp associated with an event is the value of the vector clock of its process when the event is executed. Figure 3.2 shows an example of vector clocks progress with the increment value d = 1. Initially, a vector clock is \ufffd0\ufffd0\ufffd0\ufffd \ufffd \ufffd0\ufffd. 56 Logical time Figure 3.2 Evolution of vector time [19]. 2 0 0 2 3 4 5 3 4 2 3 0 2 0 0 3 0 0 4 3 4 0 1 0 2 2 0 2 3 0 2 4 0 5 6 4 0 0 1 2 3 3 2 3 4 2 3 2 1 0 0 5 3 4 5 5 4 p1 p2 p3 The following relations are defined to compare two vector timestamps, vh and vk: vh= vk \u21d4 \u2200x \ufffd vh\ufffdx\ufffd= vk\ufffdx\ufffd vh\u2264 vk \u21d4 \u2200x \ufffd vh\ufffdx\ufffd\u2264 vk\ufffdx\ufffd vh < vk \u21d4 vh\u2264 vk and \u2203x \ufffd vh\ufffdx\ufffd < vk\ufffdx\ufffd vh \ufffd vk \u21d4 ¬\ufffdvh < vk\ufffd\u2227¬\ufffdvk < vh\ufffd\ufffd 3.4.2 Basic properties Isomorphism Recall that relation \u201c\u2192\u201d induces a partial order on the set of events that are produced by a distributed execution. If events in a distributed system are timestamped using a system of vector clocks, we have the following property. If two events x and y have timestamps vh and vk, respectively, then x\u2192 y \u21d4 vh < vk x \ufffd y \u21d4 vh \ufffd vk\ufffd Thus, there is an isomorphism between the set of partially ordered events produced by a distributed computation and their vector timestamps. This is a very powerful, useful, and interesting property of vector clocks. If the process at which an event occurred is known, the test to compare two timestamps can be simplified as follows: if events x and y respectively occurred at processes pi and pj and are assigned timestamps vh and vk, respectively, then x\u2192 y \u21d4 vh\ufffdi\ufffd\u2264 vk\ufffdi\ufffd x \ufffd y \u21d4 vh\ufffdi\ufffd > vk\ufffdi\ufffd\u2227vh\ufffdj\ufffd < vk\ufffdj\ufffd\ufffd 57 3.4 Vector time Strong consistency The system of vector clocks is strongly consistent; thus, by examining the vector timestamp of two events, we can determine if the events are causally related. However, Charron\u2013Bost showed that the dimension of vector clocks cannot be less than n, the total number of processes in the distributed com- putation, for this property to hold [2]. Event counting If d is always 1 in rule R1, then the ith component of vector clock at process pi, vti\ufffdi\ufffd, denotes the number of events that have occurred at pi until that instant. So, if an event e has timestamp vh, vh\ufffdj\ufffd denotes the number of events executed by process pj that causally precede e. Clearly, \u2211 vh\ufffdj\ufffd\u2212 1 represents the total number of events that causally precede e in the distributed computation. Applications Since vector time tracks causal dependencies exactly, it finds a wide variety of applications. For example, they are used in distributed debugging, imple- mentations of causal ordering communication and causal distributed shared memory, establishment of global breakpoints, and in determining the consis- tency of checkpoints in optimistic recovery. A brief historical perspective of vector clocks Although the theory associated with vector clocks was first developed in 1988 independently by Fidge and Mattern, vector clocks were informally introduced and used by several researchers before this. Parker et al. [17] used a rudimentary vector clocks system to detect inconsistencies of replicated files due to network partitioning. Liskov and Ladin [11] proposed a vector clock system to define highly available distributed services. Similar system of clocks was used by Strom and Yemini [26] to keep track of the causal dependencies between events in their optimistic recovery algorithm and by Raynal to prevent drift between logical clocks [18]. Singhal [24] used vector clocks coupled with a boolean vector to determine the currency of a critical section execution request by detecting the cusality relation between a critical section request and its execution. 3.4.3 On the size of vector clocks An important question to ask is whether vector clocks of size n are necessary in a computation consisting of n processes. To answer this, we examine the usage of vector clocks. \u2022 A vector clock provides the latest known local time at each other process. If this information in the clock is to be used to explicitly track the progress at every other process, then a vector clock of size n is necessary. 58 Logical time \u2022 A popular use of vector clocks is to determine the causality between a pair of events. Given any events e and f , the test for e \u227a f if and only if T\ufffde\ufffd < T\ufffdf\ufffd, which requires a comparison of the vector clocks of e and f . Although it appears that the clock of size n is necessary, that is not quite accurate. It can be shown that a size equal to the dimension of the partial order \ufffdE\ufffd\u227a\ufffd is necessary, where the upper bound on this dimension is n. This is explained below. To understand this result on the size of clocks for determining causal- ity between a pair of events, we first introduce some definitions. A linear extension of a partial order \ufffdE\ufffd\u227a\ufffd is a linear ordering of E that is consistent with the partial order, i.e., if two events are ordered in the partial order, they are also ordered in the linear order. A linear extension can be viewed as projecting all the events from the different processes on a single time axis. However, the linear order will necessarily introduce ordering between each pair of events, and some of these orderings are not in the partial order. Also observe that different linear extensions are possible in general. Let \ufffd denote the set of tuples in the partial order defined by the causality relation; so there is a tuple \ufffde\ufffd f\ufffd in \ufffd for each pair of events e and f such that e\u227a f . Let \ufffd1, \ufffd2 denote the sets of tuples in different linear extensions of this partial order. The set \ufffd is contained in the set obtained by taking the intersection of any such collection of linear extensions \ufffd1, \ufffd2 . This is because each \ufffdi must contain all the tuples, i.e., causality dependencies, that are in \ufffd . The dimension of a partial order is the minimum number of linear extensions whose intersection gives exactly the partial order. Consider a client\u2013server interaction between a pair of processes. Queries to the server and responses to the client occur in strict alternating