Exactly-Once Semantics in Distributed Systems

Nilesh Jadav

In this part of the tutorial, we will look at what exactly-once semantics means in a distributed system. To understand it, we will use a simple client-server communication model.

A general communication scenario between a client and a server involves three entities: a sender, a receiver, and a communication channel. The sender and receiver communicate on the basis of these three. Of course, other aspects are involved as well, such as addressing, the communication model, and whether the communication is local or remote, but these three entities are the main ones.

It works as follows:

The client sends a message or request to the server.
The server receives the request and sends an acknowledgement (ACK).
The server processes the request and sends the appropriate reply.
If the request is lost, the client times out and has to retransmit the request.

Let’s say the client is sending a message to the server over an unreliable connection. The server receives the message and sends an acknowledgement to the client, so that the client can be sure that its message has been received and read by the server.

If the message somehow never reaches the server, an additional mechanism is needed: the timeout. When the client sends a message to the server, it sets a timer to go off after a specific period of time.

If no acknowledgement has been received from the server within that time, the client concludes that the request has been lost. The client then performs a retransmission, sending the same request to the server again. To do this, the client must maintain a buffer in which it keeps a copy of the request message for use in such retransmission cases.
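The timeout-and-retransmit loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `LossyChannel`, `drop_first`, and `send_with_retransmission` are hypothetical names, and the channel simulates a timeout by returning `False` instead of waiting on a real timer.

```python
from dataclasses import dataclass


@dataclass
class LossyChannel:
    """Hypothetical channel that loses the first `drop_first` requests."""
    drop_first: int
    sent: int = 0

    def send(self, message: str) -> bool:
        """Return True if an ACK arrived before the (simulated) timer expired."""
        self.sent += 1
        # A lost request never reaches the server, so no ACK comes back
        # and the client's timer goes off.
        return self.sent > self.drop_first


def send_with_retransmission(channel: LossyChannel, message: str,
                             max_attempts: int = 5) -> int:
    """Send `message`, retransmitting on every timeout.

    Returns the number of transmissions that were needed.
    """
    buffer = message  # copy of the request, kept until it is acknowledged
    for attempt in range(1, max_attempts + 1):
        if channel.send(buffer):
            return attempt  # ACK received, the buffered copy can be dropped
        # Timeout: fall through and retransmit the buffered copy.
    raise TimeoutError(f"no ACK after {max_attempts} attempts")
```

With `LossyChannel(drop_first=2)`, the call returns 3: two timeouts followed by a successful third transmission.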

Now consider a case where the original client request is not lost, but the server's ACK is. From the client's perspective, this case looks exactly the same: no ACK arrives, the timer expires, and the client retries the request. From the server's perspective, the case is very different and worse. Because the ACK never reaches the client, the client assumes that the message was lost and retransmits it, even though the server has already received it. The server therefore receives the same message twice.
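The lost-ACK case can be reproduced with a small simulation. The names `NaiveServer`, `send_until_acked`, and `lost_acks` are assumptions made for this sketch; the server here does no duplicate detection at all, which is exactly what exposes the problem.

```python
class NaiveServer:
    """Hypothetical server that hands every incoming message to the application."""

    def __init__(self) -> None:
        self.delivered: list[str] = []

    def receive(self, message: str) -> str:
        self.delivered.append(message)  # no duplicate check
        return "ACK"


def send_until_acked(server: NaiveServer, message: str, lost_acks: int) -> int:
    """Retransmit until an ACK gets through; the first `lost_acks` ACKs are lost.

    Returns the number of transmissions made by the client.
    """
    attempts = 0
    while True:
        attempts += 1
        server.receive(message)   # the request itself always arrives
        if attempts > lost_acks:  # this ACK made it back to the client
            return attempts
        # ACK lost on the way back: the client times out and retransmits.


server = NaiveServer()
attempts = send_until_acked(server, "debit 10", lost_acks=1)
print(attempts, server.delivered)  # 2 ['debit 10', 'debit 10']
```

One lost ACK is enough for the application to see the same request twice.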

This is a serious problem in a computer network. Reliable communication requires that each message is delivered to the receiver exactly once, but here the message is received twice.

To handle such duplicate messages, the sender and receiver agree on a unique identifier for each message, by which they can track whether the message has already been read. When a duplicate arrives, the receiver simply ACKs it again, but it never passes the duplicate to the application that wants the data.

Hence the sender receives its ACK, but the application does not receive the message a second time. The exactly-once semantics are therefore preserved.
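The duplicate handling described above can be sketched as follows. `DedupServer`, `seen_ids`, and the integer `msg_id` are hypothetical choices for illustration; a real system would also have to decide how long the receiver remembers identifiers, which this sketch ignores.

```python
class DedupServer:
    """Hypothetical receiver that ACKs every copy but delivers each ID only once."""

    def __init__(self) -> None:
        self.seen_ids: set[int] = set()
        self.delivered: list[str] = []

    def receive(self, msg_id: int, payload: str) -> tuple[str, int]:
        if msg_id not in self.seen_ids:
            # First time this ID is seen: remember it and pass the
            # payload up to the application.
            self.seen_ids.add(msg_id)
            self.delivered.append(payload)
        # Duplicates are acknowledged as well, so the sender stops retrying.
        return ("ACK", msg_id)


server = DedupServer()
first = server.receive(1, "debit 10")
retry = server.receive(1, "debit 10")  # retransmission after a lost ACK
print(first, retry, server.delivered)
# ('ACK', 1) ('ACK', 1) ['debit 10']
```

Both transmissions are acknowledged, yet the application sees the request only once.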

I hope you like it. Thank you for reading.