Distributed Computing (2008)
difficult to build a completely synchronous system, and have the
messages delivered within a bounded time. Therefore, this synchrony has to
be simulated under the covers, and will inevitably involve delaying or block-
ing some processes for some time durations. Thus, synchronous execution is
an abstraction that needs to be provided to the programs. When implementing
this abstraction, observe that the fewer the steps or "synchronizations" of the
processors, the lower the delays and costs. If processors are allowed to have
an asynchronous execution for a period of time and then they synchronize,
then the granularity of the synchrony is coarse. This is really a virtually
synchronous execution, and the abstraction is sometimes termed as virtual
synchrony. Ideally, many programs want the processes to execute a series of
instructions in rounds (also termed as steps or phases) asynchronously, with
the requirement that after each round/step/phase, all the processes should be
synchronized and all messages sent should be delivered. This is the commonly
understood notion of a synchronous execution. Within each round/phase/step,
there may be a finite and bounded number of sequential sub-rounds (or sub-
phases or sub-steps) that processes execute. Each sub-round is assumed to
send at most one message per process; hence the message(s) sent will reach
in a single message hop.

Figure 1.10 An example of a synchronous execution in a message-passing
system. All the messages sent in a round are received within that same
round. The execution proceeds in phases (phase 1, phase 2, phase 3).
The timing diagram of an example synchronous execution is shown in
Figure 1.10. In this system, there are four nodes P0 to P3. In each round,
process Pi sends a message to P(i+1) mod 4 and P(i-1) mod 4 and calculates some
application-specific function on the received values.
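The round structure of Figure 1.10 can be sketched in a few lines. This is only an illustration of the synchrony guarantee, not the book's code; the choice of summing the received values as the application-specific function is an assumption for the example.

```python
# A minimal sketch of the synchronous execution of Figure 1.10: four
# processes on a ring; in each round, Pi sends its value to P(i+1) mod 4
# and P(i-1) mod 4, then applies an application-specific function to the
# received values (illustratively, their sum).

N = 4  # number of processes P0..P3

def run_rounds(values, rounds):
    """Simulate `rounds` synchronous rounds over a ring of N processes."""
    for _ in range(rounds):
        # Send phase: every process sends to both ring neighbours.
        inboxes = [[] for _ in range(N)]
        for i in range(N):
            inboxes[(i + 1) % N].append(values[i])
            inboxes[(i - 1) % N].append(values[i])
        # Receive phase: all messages of the round are delivered before
        # any process starts the next round (the synchrony assumption).
        values = [sum(inbox) for inbox in inboxes]
    return values

print(run_rounds([1, 2, 3, 4], 1))  # each value becomes the sum of its
                                    # two neighbours: [6, 4, 6, 4]
```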
1.7.1 Emulating an asynchronous system by a synchronous system (A→S)
An asynchronous program (written for an asynchronous system) can be emu-
lated on a synchronous system fairly trivially as the synchronous system is a
special case of an asynchronous system – all communication finishes within
the same round in which it is initiated.
1.7.2 Emulating a synchronous system by an asynchronous system (S→A)
A synchronous program (written for a synchronous system) can be emulated
on an asynchronous system using a tool called a synchronizer, to be studied in
Chapter 5.
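The guarantee a synchronizer must provide can be sketched with threads standing in for asynchronous processes. The synchronizers of Chapter 5 enforce round boundaries with messages; this sketch uses a shared barrier instead, purely to illustrate the property being emulated, and all names are illustrative.

```python
import threading
import queue

# A minimal sketch of the synchronizer guarantee: asynchronous threads
# are forced to agree on round boundaries, so that all messages of
# round r are delivered before any process enters round r+1.

N = 3
ROUNDS = 2
barrier = threading.Barrier(N)
mailboxes = [queue.Queue() for _ in range(N)]
trace = []
trace_lock = threading.Lock()

def process(i):
    for r in range(ROUNDS):
        mailboxes[(i + 1) % N].put((r, i))   # send to the next process
        barrier.wait()                       # end of the send phase
        r_recv, sender = mailboxes[i].get()  # all round-r messages are in
        with trace_lock:
            trace.append((i, r_recv, sender))
        barrier.wait()                       # end of the receive phase

threads = [threading.Thread(target=process, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Every process received exactly one message per round, in round order.
```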
1.7.3 Emulations
Section 1.5 showed how a shared memory system could be emulated by a
message-passing system, and vice-versa. We now have four broad classes of
programs, as shown in Figure 1.11. Using the emulations shown, any class
can be emulated by any other. If system A can be emulated by system B,
denoted A/B, and if a problem is not solvable in B, then it is also not solvable
in A. Likewise, if a problem is solvable in A, it is also solvable in B. Hence,
in a sense, all four classes are equivalent in terms of "computability" – what
can and cannot be computed – in failure-free systems.
Figure 1.11 Emulations among the principal system classes in a failure-free
system: synchronous message-passing (SMP), synchronous shared memory (SSM),
asynchronous message-passing (AMP), and asynchronous shared memory (ASM).
However, in fault-prone systems, as we will see in Chapter 14, this is not the
case; a synchronous system offers more computability than an asynchronous
system.

1.8 Design issues and challenges
Distributed computing systems have been in widespread existence since the
1970s when the Internet and ARPANET came into being. At the time, the
primary issues in the design of the distributed systems included providing
access to remote data in the face of failures, file system design, and directory
structure design. While these continue to be important issues, many newer
issues have surfaced as the widespread proliferation of the high-speed high-
bandwidth internet and distributed applications continues rapidly.
Below we describe the important design issues and challenges after catego-
rizing them as (i) having a greater component related to systems design and
operating systems design, or (ii) having a greater component related to algo-
rithm design, or (iii) emerging from recent technology advances and/or driven
by new applications. There is some overlap between these categories. How-
ever, it is useful to identify these categories because of the chasm among
(i) the systems community, (ii) the theoretical algorithms community within
distributed computing, and (iii) the forces driving the emerging applications
and technology. For example, the current practice of distributed comput-
ing follows the client–server architecture to a large degree, whereas that
architecture receives scant attention in the theoretical distributed
algorithms community.
Two reasons for this chasm are as follows. First, an overwhelming num-
ber of applications outside the scientific computing community of users of
distributed systems are business applications for which simple models are
adequate. For example, the client–server model has been firmly entrenched
with the legacy applications first developed by the Blue Chip companies (e.g.,
HP, IBM, Wang, DEC [now Compaq], Microsoft) since the 1970s and 1980s.
This model is largely adequate for traditional business applications. Second,
the state of the practice is largely controlled by industry standards, which do
not necessarily choose the "technically best" solution.
1.8.1 Distributed systems challenges from a system perspective
The following functions must be addressed when designing and building a
distributed system:
\u2022 Communication This task involves designing appropriate mechanisms
for communication among the processes in the network. Some exam-
ple mechanisms are: remote procedure call (RPC), remote object invocation
(ROI), and message-oriented communication versus stream-oriented
communication.
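The essence of the RPC mechanism mentioned above can be sketched without a network: a client-side stub marshals the procedure name and arguments, a transport carries the request (a plain function call stands in for the network here), and a server-side dispatcher unmarshals and invokes the procedure. All names in this sketch are illustrative assumptions, not a real RPC framework.

```python
import json

# A minimal sketch of remote procedure call (RPC): marshal the call on
# the client, carry it over a transport, dispatch it on the server.

# --- server side ---
def add(a, b):
    return a + b

REGISTRY = {"add": add}  # procedures the server exposes

def server_dispatch(request_bytes):
    """Unmarshal a request, invoke the named procedure, marshal the reply."""
    req = json.loads(request_bytes)
    result = REGISTRY[req["proc"]](*req["args"])
    return json.dumps({"result": result}).encode()

# --- client side ---
def rpc_call(proc, *args, transport=server_dispatch):
    """Client stub: the remote call looks like a local one to the caller."""
    request = json.dumps({"proc": proc, "args": list(args)}).encode()
    reply = json.loads(transport(request))
    return reply["result"]

print(rpc_call("add", 2, 3))  # prints 5
```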
\u2022 Processes Some of the issues involved are: management of processes
and threads at clients/servers; code migration; and the design of software
and mobile agents.
\u2022 Naming Devising easy to use and robust schemes for names, identifiers,
and addresses is essential for locating resources and processes in a trans-
parent and scalable manner. Naming in mobile systems provides additional
challenges because naming cannot easily be tied to any static geographical
topology.
\u2022 Synchronization Mechanisms for synchronization or coordination
among the processes are essential. Mutual exclusion is the classical exam-
ple of synchronization, but many other forms of synchronization, such as
leader election, are also needed. In addition, synchronizing physical clocks,
and devising logical clocks that capture the essence of the passage of time,
as well as global state recording algorithms, all require different forms of
synchronization.
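One of the devices this bullet alludes to, the logical clock, can be sketched in a few lines. This is the classical Lamport rule, shown here only as an illustration: increment the counter on every local event and send, and on receipt move it past the timestamp carried by the message.

```python
# A minimal sketch of a Lamport logical clock: a per-process counter
# whose updates respect the causal order of events.

class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send_event(self):
        self.time += 1
        return self.time            # timestamp attached to the message

    def receive_event(self, msg_time):
        # Move strictly past both the local time and the message time.
        self.time = max(self.time, msg_time) + 1
        return self.time

p, q = LamportClock(), LamportClock()
ts = p.send_event()     # p's clock: 1
q.local_event()         # q's clock: 1
q.receive_event(ts)     # q's clock: max(1, 1) + 1 = 2
```

The receive rule guarantees that if event a causally precedes event b, then clock(a) < clock(b), which is what makes such clocks useful for ordering events across processes.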
\u2022 Data storage and access Schemes for data storage, and implicitly for
accessing the data in a fast and scalable manner across the network are
important for efficiency. Traditional issues such as file system design have
to be reconsidered in the setting of a distributed system.
\u2022 Consistency and replication To avoid bottlenecks, to provide fast access
to data, and to provide scalability, replication of data objects is highly
desirable. This leads to issues of managing the replicas, and dealing with
consistency among the replicas/caches in a distributed setting. A simple
example issue is deciding the level of granularity (i.e., size) of data access.
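One classical way to manage consistency among replicas is quorum reads and writes: with N replicas, a write quorum W and a read quorum R chosen so that R + W > N, every read quorum intersects every write quorum, so a read always observes the latest completed write. The sketch below is illustrative, with worst-case (single-replica) overlap chosen deliberately.

```python
# A minimal sketch of quorum-based replica consistency: R + W > N
# guarantees read/write quorum intersection.

N, W, R = 5, 3, 3            # R + W = 6 > 5
replicas = [(0, None)] * N   # (version, value) held by each replica

def write(value, version):
    """Install the write at any W replicas (here, the first W)."""
    global replicas
    replicas = [(version, value) if i < W else replicas[i]
                for i in range(N)]

def read():
    """Contact any R replicas; the highest version number wins."""
    contacted = replicas[-R:]   # worst case: overlap of exactly one replica
    return max(contacted, key=lambda t: t[0])

write("x=1", version=1)
print(read())   # (1, 'x=1') -- the read quorum saw the latest write
```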
\u2022 Fault tolerance Fault tolerance requires maintaining correct and efficient
operation in spite of any failures of links, nodes, and processes. Process
resilience, reliable communication, distributed commit,