Messaging is one source of replica non-determinism in a
fault-tolerant distributed system: message loss,
retransmission, and network delays can cause the replicas
to receive the same messages in different orders. To
maintain strong replica consistency, messages must be
delivered to the replicas in the same order.
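
As a rough illustration of same-order delivery, the sketch below assumes
that every message already carries a global sequence number (assigned by
some ordering protocol that is not shown here). Each replica holds back
messages that arrive early and drops retransmitted duplicates. The names
OrderedDelivery and replay are hypothetical and serve only this example.

```python
class OrderedDelivery:
    """Hold-back buffer that delivers messages in sequence-number order."""

    def __init__(self):
        self.next_seq = 0    # sequence number of the next message to deliver
        self.pending = {}    # seq -> payload, received but not yet deliverable
        self.delivered = []  # payloads in the order they were delivered

    def receive(self, seq, payload):
        # Ignore duplicates caused by retransmission.
        if seq < self.next_seq or seq in self.pending:
            return
        self.pending[seq] = payload
        # Deliver every message that is now contiguous with earlier deliveries.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


def replay(arrivals):
    """Feed (seq, payload) pairs to a fresh replica in arrival order."""
    replica = OrderedDelivery()
    for seq, payload in arrivals:
        replica.receive(seq, payload)
    return replica.delivered
```

Two replicas that receive the same messages in different arrival orders,
with or without duplicates, end up with the same delivery order under
this scheme.
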

Another source of replica non-determinism is multi-threading.
If two threads within a replica share data, they
must claim and release mutexes that protect that shared
data. However, the threads in two replicas will most likely
run at slightly different speeds. In one replica, one thread
might be the first to claim a mutex and, in another replica,
a different thread might be the first to claim the mutex. To
maintain strong replica consistency, mutexes must be
granted to the threads within the replicas in the same
order.
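
One way to picture same-order mutex granting is a mutex that follows an
agreed grant order. The sketch below assumes that the order (a list of
thread names) has already been decided and made known to every replica;
how that agreement is reached is outside the sketch. OrderedMutex and
run_replica are hypothetical names, not part of any particular system.

```python
import threading


class OrderedMutex:
    """Mutex that is granted to threads in a previously agreed order."""

    def __init__(self, grant_order):
        self._order = list(grant_order)  # agreed sequence of thread names
        self._index = 0                  # position of the next grant
        self._held = False
        self._cond = threading.Condition()

    def acquire(self, name):
        with self._cond:
            # Block until the mutex is free and this thread is next in the
            # agreed order. A thread that is not in the order waits forever.
            self._cond.wait_for(
                lambda: not self._held
                and self._index < len(self._order)
                and self._order[self._index] == name
            )
            self._held = True
            self._index += 1

    def release(self):
        with self._cond:
            self._held = False
            self._cond.notify_all()


def run_replica(grant_order, start_order):
    """Start one thread per name in start_order; return the claim order."""
    mutex = OrderedMutex(grant_order)
    log = []

    def worker(name):
        mutex.acquire(name)
        log.append(name)  # shared data, protected by the mutex
        mutex.release()

    threads = [threading.Thread(target=worker, args=(n,)) for n in start_order]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Even if the threads start in different orders in different replicas, the
shared log is updated in the agreed order in each of them.
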

Other sources of replica non-determinism include
operating system and library functions that return values local
to the processor on which they are executed, such as rand() and
gettimeofday(), or inputs for the replicas from different
redundant sources, or system exceptions due to, say, lack
of memory on one of the processors. These sources of
replica non-determinism must be sanitized, so that all of
the replicas see the same values of the functions, the same
inputs from the redundant sources, and the same system
exceptions. Such virtual determinism must be provided for
the replicas, regardless of which kind of replication is
used.
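
A minimal sketch of sanitizing such functions follows, assuming that one
replica executes the real call and that its results are conveyed to the
other replicas (for example, in ordered messages, which are not shown).
The class SanitizedCalls and its methods are hypothetical; Python's
time.time and random.random stand in for gettimeofday() and rand().

```python
import random
import time


class SanitizedCalls:
    """Wrapper that makes local-valued calls return the same values everywhere."""

    def __init__(self, recorded=None):
        # recorded is None on the replica that executes the real calls;
        # the other replicas are given that replica's recorded values.
        self._replay = None if recorded is None else list(recorded)
        self.recorded = []

    def _call(self, real):
        if self._replay is None:
            value = real()               # executing replica: use the real function
        else:
            value = self._replay.pop(0)  # other replicas: reuse the recorded value
        self.recorded.append(value)
        return value

    def now(self):
        return self._call(time.time)

    def rand(self):
        return self._call(random.random)
```

For example, if one replica calls now() and then rand(), a second replica
constructed with SanitizedCalls(first.recorded) returns the same two
values for the same two calls, whatever its local clock or generator
state.
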