Last time, we looked at taking a booking. Now we want that booking to survive a server restart. We are going to do something very simple, using DETS; for a more realistic scenario you might want to store your data in an RDBMS, or Kafka, or even a log replicated with a consensus protocol (Raft/Paxos).
First, we need to open the file, during init:
{ok, _Name} = dets:open_file(bookings, []),
We can then save the command, after validation:
handle_call({book_room, Cmd}, _From, #{available_rooms:=AvailableRooms, version:=Version} = State) ->
    ...
    NewVersion = Version + 1,
    ok = dets:insert(bookings, {NewVersion, Cmd}),
    ...
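As a minimal sketch of the open/append/close cycle around that insert: the module name `booking_log`, the file name `bookings.dets` and the explicit `dets:sync/1` call are assumptions for illustration, not part of the original server. DETS buffers writes and flushes them periodically, so syncing after each command trades some throughput for durability.

```erlang
-module(booking_log).
-export([open/0, append/2, close/0]).

%% Hypothetical wrapper; only the table name `bookings` comes from the post.
open() ->
    %% {file, ...} sets the path explicitly (assumed name); without it DETS
    %% uses the table name as the file name. {type, set} is the default.
    dets:open_file(bookings, [{file, "bookings.dets"}, {type, set}]).

%% Store one command under its version, then force it to disk.
append(Version, Cmd) ->
    ok = dets:insert(bookings, {Version, Cmd}),
    dets:sync(bookings).

%% Close cleanly (e.g. from terminate/2) so the file does not need
%% repairing the next time it is opened.
close() ->
    dets:close(bookings).
```

Because the table is a `set` keyed on the version, inserting the same version twice overwrites the earlier command rather than appending a duplicate.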
Now, if the process dies and is respawned, we can resurrect the state by replaying all the saved commands:
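init(unused) ->
    ...
    %% ets:fun2ms/1 is a parse transform, so the module needs
    %% -include_lib("stdlib/include/ms_transform.hrl").
    MatchSpec = ets:fun2ms(fun({N,Cmd}) when N >= 0 -> {N, Cmd} end),
    %% a DETS set is unordered, so sort by version before replaying
    ExistingBookings = lists:sort(fun({A,_}, {B,_}) -> A =< B end, dets:select(bookings, MatchSpec)),
    NewBookings = lists:foldl(fun({_, Cmd}, B) -> add_new_booking(Cmd, B) end, maps:get(bookings, State), ExistingBookings),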
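The replay step can also be sketched as a standalone function. This is an illustration under assumptions: the module name `booking_replay` is hypothetical, and `ApplyFun` stands in for `add_new_booking/2`, whose body is not shown here. The match spec is written by hand, so this version does not need the `ms_transform` include.

```erlang
-module(booking_replay).
-export([replay/3]).

%% Replays every {Version, Cmd} stored in the open DETS table Tab, oldest
%% first, folding ApplyFun(Cmd, Acc) over the commands.
replay(Tab, ApplyFun, Acc0) ->
    %% Match every 2-tuple and return it unchanged.
    Entries = dets:select(Tab, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}]),
    %% Sort on the first element (the version) to restore the write order.
    Sorted = lists:keysort(1, Entries),
    lists:foldl(fun({_Version, Cmd}, Acc) -> ApplyFun(Cmd, Acc) end,
                Acc0, Sorted).
```

`lists:keysort/2` does the same job as the custom comparison fun above, since the version is the first element of each tuple.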
This isn’t exactly “event sourcing”, but something more akin to the write-ahead log (WAL) that databases use. We could just load all the history (which is basically what is happening right now), but the match spec:
[{{'$1','$2'},[{'>=','$1',0}],[{{'$1','$2'}}]}]
will come in handy. We can now stop & start the server, and as long as the file on disk remains, the history will survive. To make things more efficient, we can save a snapshot of the state at regular intervals, and only replay the commands newer than that:
init(unused) ->
    {ok, _} = dets:open_file(snapshot, []),
    State = case dets:lookup(snapshot, latest) of
        [] ->
            new_state();
        [{latest, S}] ->
            S
    end,
    LatestVersion = maps:get(version, State),
    MatchSpec = ets:fun2ms(fun({N,Cmd}) when N > LatestVersion -> {N, Cmd} end),
    ...
    erlang:send_after(60 * 1000, self(), save_snapshot),
    ...
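handle_info(save_snapshot, State) ->
    ok = dets:insert(snapshot, {latest, State}),
    %% re-arm the timer, otherwise only one snapshot is ever taken
    erlang:send_after(60 * 1000, self(), save_snapshot),
    {noreply, State};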
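To see the two halves of recovery in isolation, here is a rough sketch that loads the latest snapshot and then selects only the log entries written after it. The module and function names (`booking_snapshot`, `load/2`, `newer_than/2`) are hypothetical, and the match spec is built by hand instead of with `ets:fun2ms/1`.

```erlang
-module(booking_snapshot).
-export([load/2, newer_than/2]).

%% Returns the state stored under the key `latest` in the open table Tab,
%% or Default if no snapshot has been written yet.
load(Tab, Default) ->
    case dets:lookup(Tab, latest) of
        [] -> Default;
        [{latest, State}] -> State
    end.

%% Returns the {Version, Cmd} entries in Tab whose version is strictly
%% greater than Version, sorted oldest first.
newer_than(Tab, Version) when is_integer(Version) ->
    MatchSpec = [{{'$1', '$2'}, [{'>', '$1', Version}], [{{'$1', '$2'}}]}],
    lists:keysort(1, dets:select(Tab, MatchSpec)).
```

With a snapshot at version 2 and a log holding versions 1 to 4, only versions 3 and 4 would be replayed on top of the snapshot.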
This isn’t without risk. The main benefit of the “let it crash” philosophy is that any bad state is blown away; once commands are persisted, we have to take into account that we could replay a poison message, and crash again on every restart. This matters particularly if, e.g., the actual hotel availability comes from an external system, so replaying an action might not be idempotent. We might need to discard some commands, or even perform a compensating action.
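One way to discard poison commands during replay, sketched under assumptions: the module name `safe_replay` is hypothetical, and `ApplyFun` again stands in for whatever applies a command to the state. A command that raises is skipped and reported, instead of taking `init/1` down with it. Compensating actions are left to the caller.

```erlang
-module(safe_replay).
-export([replay/3]).

%% Applies each {Version, Cmd} in order. If ApplyFun(Cmd, State) raises,
%% the state is left untouched and {Version, Class, Reason} is recorded.
%% Returns {FinalState, SkippedEntries}, with skipped entries oldest first.
replay(ApplyFun, State0, Entries) ->
    {Final, Skipped} =
        lists:foldl(
          fun({Version, Cmd}, {Acc, Bad}) ->
                  try ApplyFun(Cmd, Acc) of
                      NewAcc -> {NewAcc, Bad}
                  catch
                      Class:Reason ->
                          {Acc, [{Version, Class, Reason} | Bad]}
                  end
          end,
          {State0, []},
          Entries),
    {Final, lists:reverse(Skipped)}.
```

Catching every exception class is a deliberate simplification here; a real server would probably want to log the skipped versions and decide per error whether skipping is safe.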