It’s a Syn (Part 2)

To make things more interesting, we need to take the simple model, and run it with multiple concurrent nodes:

CONSTANTS
Nodes = {n1, n2}

Just two, to start with. The model state needs to become more complex:

Init ==
    /\ inbox = [n \in Nodes |-> <<>>]
    /\ registered = [n \in Nodes |-> {}]
    /\ next_val = 0
    /\ added = 0
    /\ removed = 0

Each node now has its own set of registered values; and, this being an Erlang system, it makes sense to model the inbox for messages (a sequence, for each node). In this case, a node is really a process (or gen_server).

Next ==
    /\ \E n \in Nodes:
        \/ Register(n)
        \/ SyncRegister(n)
        \/ Unregister(n)
        \/ SyncUnregister(n)
        \/ Complete

There are more options, for the next state: we now pick a node from the set of all nodes, and apply an action to that. Also, register & unregister have split in two (async).

Register(n) ==
    /\ \A o \in Nodes: Len(inbox[o]) = 0
    /\ next_val < MaxValues
    /\ registered' = [registered EXCEPT![n] = registered[n] \union {next_val}]
    /\ next_val' = next_val + 1
    /\ added' = added + 1
    /\ inbox' = [o \in Nodes |-> IF o = n THEN inbox[o] ELSE Append(inbox[o], [action |-> "sync_register", name |-> next_val])]
    /\ UNCHANGED <<removed>>

First we update the local registry, then broadcast a message to all other nodes, to do the same.

SyncRegister(n) ==
    /\ Len(inbox[n]) > 0
    /\ Head(inbox[n]).action = "sync_register"
    /\ registered' = [registered EXCEPT![n] = registered[n] \union {Head(inbox[n]).name}]
    /\ inbox' = [inbox EXCEPT![n] = Tail(inbox[n])]
    /\ UNCHANGED <<removed, added, next_val>>

To make things simpler, I decided that a pre-condition for register would be that all nodes have empty inboxes (i.e. any previous messages have already been processed). This obviously isn’t realistic, and the constraint will be removed later.

ItemToRemove(n) ==
    CHOOSE r \in registered[n]: TRUE

Unregister(n) ==
    /\ \A o \in Nodes: Len(inbox[o]) = 0
    /\ Cardinality(registered[n]) > 0
    /\ LET item_to_remove == ItemToRemove(n)
        IN registered' = [registered EXCEPT![n] = registered[n] \ {item_to_remove}]
        /\ inbox' = [o \in Nodes |-> IF o = n THEN inbox[o] ELSE Append(inbox[o], [action |-> "sync_unregister", name |-> item_to_remove])]
    /\ removed' = removed + 1
    /\ UNCHANGED <<added, next_val>>

SyncUnregister(n) ==
    /\ Len(inbox[n]) > 0
    /\ Head(inbox[n]).action = "sync_unregister"
    /\ registered' = [registered EXCEPT![n] = registered[n] \ {Head(inbox[n]).name}]
    /\ inbox' = [inbox EXCEPT![n] = Tail(inbox[n])]
    /\ UNCHANGED <<removed, added, next_val>>

Unsurprisingly, unregister is still a mirror image of the same steps. We maintain the same invariant:

AllRegistered ==
    \A n \in Nodes:
        Len(inbox[n]) = 0 => Cardinality(registered[n]) = (added - removed)

but evaluated on each node, individually; and we don’t bother checking until all messages have been processed (as the counters obviously won’t match).

And now we also check that all messages have been consumed, when the system halts:

PROPERTIES
AllMessagesProcessed

AllMessagesProcessed ==
    \A n \in Nodes:
        <>(Len(inbox[n]) = 0)

Again, we can run this with the model checker:

Starting...
Implied-temporal checking--satisfiability problem has 2 branches.
Computing initial states...
Finished computing initial states: 1 distinct state generated at 2024-01-05 13:15:50.
Progress(41) at 2024-01-05 13:15:50: 543 states generated, 286 distinct states found, 0 states left on queue.
Checking 2 branches of temporal properties for the complete state space with 572 total distinct states at (2024-01-05 13:15:50)
Finished checking temporal properties in 00s at 2024-01-05 13:15:50
Model checking completed. No error has been found.
  Estimates of the probability that TLC did not check all reachable states
  because two distinct states had the same fingerprint:
  calculated (optimistic):  val = 4.0E-15
543 states generated, 286 distinct states found, 0 states left on queue.
The depth of the complete state graph search is 41.

And the number of possible states has ballooned from the simple model, but still no errors!

We can also try registering a larger number of keys, or using more nodes, with the same result; but the search space, and runtime, can increase dramatically (this is the downside to exhaustive checking, vs example based testing or even generative random testing).

Leave a comment