It’s a Syn (Part 4)

We are starting to get to the business end. So how can we model a netsplit? The easiest thing seems to be to add a set for each node, listing the nodes that are visible from it:

AllOtherNodes(n) ==
    Nodes \ {n}

Init ==
     ...
    /\ visible_nodes = [n \in Nodes |-> AllOtherNodes(n)]

At the start, all nodes can see all other nodes. Now, when we register a new value:

Register(n) ==
     ...
    /\ inbox' = [o \in Nodes |->
                       IF o \in visible_nodes[n] THEN
                           Append(inbox[o], [action |-> "sync_register", name |-> next_val])
                       ELSE
                           inbox[o]
                   ]

we only send messages to the nodes that are still visible (same for unregister). And we add two new state transitions:

Next ==
    /\ \E n \in Nodes:
        ...
        \/ Disconnect(n)
        \/ Reconnect(n)

To disconnect:

Disconnect(n) ==
    /\ Cardinality(visible_nodes[n]) > 0
    /\ LET other_node == CHOOSE o \in visible_nodes[n]: TRUE
        IN visible_nodes' = [o \in Nodes |-> CASE
            (o = other_node) -> visible_nodes[o] \ {n}
            [] (o = n) -> visible_nodes[o] \ {other_node}
            [] OTHER -> visible_nodes[o]
        ]
    /\ UNCHANGED <<registered, locally_registered, inbox, next_val>>

If other nodes are still visible; we select a node, at random, from the set and remove it (on both sides). And reconnection is the opposite:

Reconnect(n) ==
    /\ Cardinality(AllOtherNodes(n) \ visible_nodes[n]) > 0
    /\ LET other_node == CHOOSE o \in (AllOtherNodes(n) \ visible_nodes[n]): TRUE
        IN visible_nodes' = [o \in Nodes |-> CASE
            (o = other_node) -> visible_nodes[o] \union {n}
            [] (o = n) -> visible_nodes[o] \union {other_node}
            [] OTHER -> visible_nodes[o]
        ]
    /\ UNCHANGED <<registered, locally_registered, inbox, next_val>>

We also need to update our invariant:

AllRegistered ==
    \A n \in Nodes:
        Len(inbox[n]) = 0 /\ visible_nodes[n] = AllOtherNodes(n) => locally_registered[n] = registered

to only compare the registered values, once a netsplit has healed. If we run this model:

Error: Invariant AllRegistered is violated.
Error: The behavior up to this point is:
State 1: <Initial predicate>
/\ next_val = 0
/\ inbox = (n1 :> <<>> @@ n2 :> <<>>)
/\ locally_registered = (n1 :> {} @@ n2 :> {})
/\ registered = {}
/\ visible_nodes = (n1 :> {n2} @@ n2 :> {n1})

State 2: <Next line 78, col 5 to line 85, col 19 of module syn>
/\ next_val = 0
/\ inbox = (n1 :> <<>> @@ n2 :> <<>>)
/\ locally_registered = (n1 :> {} @@ n2 :> {})
/\ registered = {}
/\ visible_nodes = (n1 :> {} @@ n2 :> {})

State 3: <Next line 78, col 5 to line 85, col 19 of module syn>
/\ next_val = 1
/\ inbox = (n1 :> <<>> @@ n2 :> <<>>)
/\ locally_registered = (n1 :> {0} @@ n2 :> {})
/\ registered = {0}
/\ visible_nodes = (n1 :> {} @@ n2 :> {})

State 4: <Next line 78, col 5 to line 85, col 19 of module syn>
/\ next_val = 1
/\ inbox = (n1 :> <<>> @@ n2 :> <<>>)
/\ locally_registered = (n1 :> {0} @@ n2 :> {})
/\ registered = {0}
/\ visible_nodes = (n1 :> {n2} @@ n2 :> {n1})

it fails pretty quickly. But that is what we should expect, as new keys could have been added while the network was partitioned, and we aren’t doing anything to resolve that. And that is what we will be looking into, next time.

Leave a comment