Using erlydtl with Cowboy

Erlydtl seems to be the most popular Erlang templating library, and using it with Cowboy is fairly simple, but it doesn’t seem to be terribly well documented.

As always, I’m assuming you’re using erlang.mk and know how to set up a Cowboy project. First, you need to add erlydtl as a dependency in your Makefile:

PROJECT = cowboy_erlydtl
DEPS = cowboy erlydtl
include erlang.mk

and as a dependency in your .app.src:

{application, cowboy_erlydtl, [
    {description, ""},
    {vsn, "0.1.0"},
    {id, "git"},
    {modules, []},
    {registered, []},
    {applications, [
        kernel,
        stdlib,
        cowboy,
        erlydtl
    ]},
    {mod, {cowboy_erlydtl_app, []}},
    {env, []}
]}.

Next, create a folder called “templates” in your project root. Any .dtl files in there (e.g. templates/smurfin.dtl) will be compiled when you run “make app”:

<html><body>Your favourite Smurf is {{ smurf_name }}.</body></html>

If you look in ebin/ you should see a file named smurfin_dtl.beam (or whatever your template is called); this is the compiled version of your template. Finally, you need a handler to render the template:

-module(smurf_handler).
-behaviour(cowboy_http_handler).

-export([init/3]).
-export([handle/2]).
-export([terminate/3]).

-record(state, {}).

init(_, Req, _Opts) ->
    {ok, Req, #state{}}.

handle(Req, State=#state{}) ->
    {ok, Body} = smurfin_dtl:render([{smurf_name, "Smurfette"}]),
    {ok, Req2} = cowboy_req:reply(200, [{<<"content-type">>, <<"text/html">>}], Body, Req),
    {ok, Req2, State}.

terminate(_Reason, _Req, _State) ->
    ok.

and add a route:

    Dispatch = cowboy_router:compile([
        {'_', [
            {"/smurfin", smurf_handler, []}
        ]}
    ]),

Et voilà, you’ve rendered your first template!

Testing a gen_server with eunit

Of all the strange corners of Erlang, eunit can be one of the hardest to get your head around, particularly if you’re used to one of the xUnit frameworks.

It is very powerful, especially once you get into generating tests; but it can also be very confusing, and sometimes the error messages are not that helpful.

I wanted to write some tests for a gen_server, using a new instance for each test, and it took me a couple of attempts to get what I wanted working. It’s very important to make sure your test can actually fail, either in advance or by mutating it, or to add some output (?debugMsg) that proves it ran.

-module(foo_server_tests).

-include_lib("eunit/include/eunit.hrl").

foo_server_test_() ->
    {foreach, fun setup/0, fun cleanup/1, [
        fun(Pid) -> fun() -> server_is_alive(Pid) end end,
        fun(Pid) -> fun() -> something_else(Pid) end end
    ]}.

setup() ->
    ?debugMsg("setup"),
    process_flag(trap_exit, true),
    {ok, Pid} = foo_server:start_link(),
    Pid.

server_is_alive(Pid) ->
    ?assertEqual(true, is_process_alive(Pid)).

something_else(Pid) ->
    ?assertEqual(bar, gen_server:call(Pid, foo)).

cleanup(Pid) ->
    ?debugMsg("cleanup"),
    exit(Pid, kill), %% brutal kill!
    ?assertEqual(false, is_process_alive(Pid)).

We need to trap exits in the setup fun, as the test process is linked to the gen_server when it starts (otherwise the kill in cleanup would take the test process down with it). The tricky bit is the nested funs in the generator. If you try to call the “test” directly:

foo_server_test_() ->
    {foreach, fun setup/0, fun cleanup/1, [
        fun(Pid) -> server_is_alive(Pid) end
    ]}.

you’ll get an error like this:

*** result from instantiator foo_server_tests:'-foo_server_test_/0-fun-2-'/1 is not a test ***

because eunit is trying to use the output of the called fun as a test. And if you return multiple tests from one block:

foo_server_test_() ->
    {foreach, fun setup/0, fun cleanup/1, [
        fun(Pid) -> [
             fun() -> server_is_alive(Pid) end,
             fun() -> something_else(Pid) end
        ] end
    ]}.

then the setup & teardown will only be called once for those tests (which may, or may not, be what you want).

UPDATE: actually this is a better approach:

-module(foo_server_tests).

-include_lib("eunit/include/eunit.hrl").

foo_server_test_() ->
    {foreach, fun setup/0, fun cleanup/1, [
        fun server_is_alive/1,
        fun something_else/1
    ]}.

server_is_alive(Pid) ->
    fun() ->
        ?assertEqual(true, is_process_alive(Pid))
    end.

something_else(Pid) ->
    fun() ->
        ?assertEqual(bar, gen_server:call(Pid, foo))
    end.

as the test names are included in the output:

$ SKIP_DEPS=true make eunit
 APP    foo.app.src
 GEN    test-dir
 GEN    eunit
======================== EUnit ========================
directory "ebin"
  module 'foo_server'
    module 'foo_server_tests'
      foo_server_tests: server_is_alive...ok
      foo_server_tests: something_else...[0.001 s] ok
      [done in 0.012 s]
    [done in 0.012 s]
  [done in 0.028 s]
=======================================================
  All 2 tests passed.

UPDATE 2: in case it wasn’t clear, this is effectively an “integration” test or black box test. The server is started, and messages are passed to it. It is also possible to “unit” test individual callback functions, if you don’t mind being more tightly coupled to the internal state of the server (white box). Both kinds of testing can provide value; pick whichever works for you in a given scenario.
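As a rough illustration of the white box style, the sketch below calls a callback as a plain function, without starting a process. The callbacks of foo_server aren’t shown in this post, so the handle_call/3 clause and the empty map used as state are stand-ins invented for the example.

```erlang
-module(foo_whitebox_sketch).

-include_lib("eunit/include/eunit.hrl").

-export([handle_call/3]).

%% Hypothetical stand-in for foo_server:handle_call/3; the real
%% callback is not shown in this post.
handle_call(foo, _From, State) ->
    {reply, bar, State}.

%% White box test: no server is started, the callback is invoked
%% directly and its return tuple is checked.
handle_call_foo_test() ->
    From = {self(), make_ref()},
    State = #{},
    ?assertEqual({reply, bar, State}, handle_call(foo, From, State)).
```

The trade-off is the one described above: this test knows the shape of the state and of the reply tuple, so it breaks when either changes, even if the server still behaves the same from the outside.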

Building a Tic Tac Toe game with Cowboy & websockets (Part 4)

Saving game state

As discussed in Part 3, if the FSM crashes it will be restarted, but the game state will be lost, which isn’t ideal.

The easiest way to solve this is to store the state record in an ETS or DETS table (the main difference being whether it is kept just in memory or stored on disk). I decided to use a DETS table, although if the VM were to crash (or be stopped deliberately), the game supervisor would not yet restart the child FSMs when it came back up; there are some other problems to solve first, such as restoring sessions.
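To make the “stored on disk” part concrete, here is a minimal sketch of a DETS round trip: write a value, close the table, reopen it and read the value back. The table name, the key and the Path argument are all made up for the example and have nothing to do with the game code.

```erlang
-module(dets_sketch).

-export([round_trip/1]).

%% Path is a file the caller is happy to have created, for example
%% "/tmp/dets_sketch.dets" (an assumption of this sketch).
round_trip(Path) ->
    Table = dets_sketch_table,
    {ok, Table} = dets:open_file(Table, [{file, Path}]),
    ok = dets:insert(Table, {<<"game-1">>, p2_turn}),
    ok = dets:close(Table),
    %% Reopen the same file: the object is read back from disk.
    {ok, Table} = dets:open_file(Table, [{file, Path}]),
    Result = dets:lookup(Table, <<"game-1">>),
    ok = dets:close(Table),
    Result.
```

With an ETS table the data would be gone once the owning process exited, which is exactly the situation we are trying to recover from.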

The first step was to add the game id, and current FSM state to the state record:

-record(state, {game_id, current_state=p1_turn, p1, p2, board=#{
    <<"1,1">> => '_', <<"1,2">> => '_', <<"1,3">> => '_',
    <<"2,1">> => '_', <<"2,2">> => '_', <<"2,3">> => '_',
    <<"3,1">> => '_', <<"3,2">> => '_', <<"3,3">> => '_'
}}).

And update the init function to open the table, and check if any state already exists for that game id:

init(Args) ->
    [{P1, P2, GameId}] = Args,
    process_flag(trap_exit, true),
    true = gproc:reg({n, l, GameId}),
    {ok, t3_game_state} = dets:open_file(t3_game_state, []),
    State = get_game_state(GameId, P1, P2),
    notify_players(State),
    {ok, State#state.current_state, State}.

get_game_state(GameId, P1, P2) ->
    case dets:lookup(t3_game_state, GameId) of
        [] ->
            save_game_state(#state{game_id = GameId, p1 = P1, p2 = P2});
        [{GameId, State}] ->
            State
    end.

We also trap exits, so that the terminate function of the FSM is called and we can close the table:

terminate(_Reason, _StateName, _State) ->
    dets:close(t3_game_state).

Finally, we save the game state after any successful FSM transition:

p1_turn({play, P1, Cell}, State = #state{p1 = P1}) ->
    NewState = play(Cell, State, 'O', p2_turn),
    Res = case t3_game:has_won(NewState#state.board, 'O') of
        true ->
            game_won(NewState#state.p1, NewState#state.p2, NewState#state.board),
            {stop, normal, NewState};
        false ->
            case t3_game:is_draw(NewState#state.board) of
                true ->
                    game_drawn(NewState#state.p1, NewState#state.p2, NewState#state.board),
                    {stop, normal, NewState};
                false ->
                    notify_players(NewState),
                    {next_state, p2_turn, NewState}
            end
    end,
    save_game_state(NewState),
    Res.

save_game_state(State = #state{game_id = GameId}) ->
    ok = dets:insert(t3_game_state, {GameId, State}),
    State.

Now, if the process crashes, it will be resurrected with the state rolled back to before the offending message. Of course in this system, it’s likely that any crash will be repeated if the same message arrives at the same state; but if your application is likely to suffer from transient errors (e.g. network errors calling another system) then you can just “let it crash”.

There are, of course, some trade-offs to this solution. A DETS table is only available to one Erlang node; if you wanted to scale out, it would make more sense to use replicated Mnesia or an external data store such as Postgres or Riak. The shared DETS table could also become a bottleneck when there are many games running.

A more pressing problem would be handling versioning of the game state: this naive code would crash if the record definition was changed and an old version was retrieved from the data store.
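One possible way to deal with that, sketched here rather than implemented in the game, is to store a version number next to the state and upgrade old entries when they are read. The version numbers, the function names and the “version 1” layout (the Part 3 record, which had no game_id or current_state) are all assumptions made for the example.

```erlang
-module(state_version_sketch).

-export([wrap/1, upgrade/1]).

-define(CURRENT_VERSION, 2).

%% The current record, with the same fields as the one in this post
%% (the board default is shortened to an empty map here).
-record(state, {game_id, current_state = p1_turn, p1, p2, board = #{}}).

%% Tag the state with a version before writing it to the table. The
%% game id stays in the first position, so it is still the DETS key.
wrap(State = #state{game_id = GameId}) ->
    {GameId, ?CURRENT_VERSION, State}.

%% Turn whatever was read from the table into the current record.
upgrade({_GameId, ?CURRENT_VERSION, State = #state{}}) ->
    State;
upgrade({GameId, 1, {state, P1, P2, Board}}) ->
    %% Hypothetical old layout; the missing fields get their defaults.
    #state{game_id = GameId, p1 = P1, p2 = P2, board = Board}.
```

In this scheme get_game_state would have to match a three-element tuple and pass it through upgrade/1, and every future change to the record would need a new clause.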

Building a Tic Tac Toe game with Cowboy & websockets (Part 3)

Game FSM

Following on from Part 1, and Part 2, we’re now going to take a look at the game itself.

My first attempt at modeling the game state was using a list of lists:

-record(state, {p1, p2, board=[
    ['_', '_', '_'],
    ['_', '_', '_'],
    ['_', '_', '_']
]}).

but the lack of mutation makes that awkward to work with. My second draft used a map, keyed by {row, column} tuples:

-record(state, {p1, p2, board=#{
    {1,1} => '_', {1,2} => '_', {1,3} => '_',
    {2,1} => '_', {2,2} => '_', {2,3} => '_',
    {3,1} => '_', {3,2} => '_', {3,3} => '_'
}}).

but that blew up when being serialized to send to the client, due to the tuple keys. I could have re-formatted it before sending, but it seemed like less hassle to use a single model. So the final version just used binary keys:

-record(state, {p1, p2, board=#{
    <<"1,1">> => '_', <<"1,2">> => '_', <<"1,3">> => '_',
    <<"2,1">> => '_', <<"2,2">> => '_', <<"2,3">> => '_',
    <<"3,1">> => '_', <<"3,2">> => '_', <<"3,3">> => '_'
}}).
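JSON object keys have to be strings, which is why the binary keys serialize cleanly where the tuples did not. If you would rather work with row and column numbers in Erlang, a small helper can translate between the two. This is only a sketch; cell_key/2 and empty_board/0 are names invented here and are not part of the game code.

```erlang
-module(board_keys_sketch).

-export([cell_key/2, empty_board/0]).

%% Build the <<"Row,Col">> binary used as a board key.
cell_key(Row, Col) ->
    <<(integer_to_binary(Row))/binary, ",", (integer_to_binary(Col))/binary>>.

%% The same nine-cell empty board as the record default, generated
%% with a list comprehension instead of being written out by hand.
empty_board() ->
    maps:from_list([{cell_key(R, C), '_'} || R <- [1, 2, 3], C <- [1, 2, 3]]).
```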

When the game is started:

init(Args) ->
    io:format("New game started: ~p~n", [Args]),
    [{P1, P2, GameId}] = Args,
    true = gproc:reg({n, l, GameId}),
    State = #state{p1 = P1, p2 = P2},
    P1 ! {your_turn, State#state.board},
    P2 ! {wait, State#state.board},
    {ok, p1_turn, State}.

It registers itself with gproc using the game id, and notifies the players of whose turn it is. There are only two states, p1_turn and p2_turn:

p1_turn({play, P1, Cell}, State = #state{p1 = P1}) ->
    NewState = play(Cell, State, 'O'),
    case t3_game:has_won(NewState#state.board, 'O') of
        true ->
            game_won(NewState#state.p1, NewState#state.p2, NewState#state.board),
            {stop, normal, NewState};
        false ->
            case t3_game:is_draw(NewState#state.board) of
                true ->
                    game_drawn(NewState#state.p1, NewState#state.p2, NewState#state.board),
                    {stop, normal, NewState};
                false ->
                    notify_players(NewState#state.p2, NewState#state.p1, NewState#state.board),
                    {next_state, p2_turn, NewState}
            end
    end.

play(Cell, State, Symbol) ->
    '_' = maps:get(Cell, State#state.board),
    State#state{board = maps:update(Cell, Symbol, State#state.board)}.

notify_players(Play, Wait, Board) ->
    Play ! {your_turn, Board},
    Wait ! {wait, Board}.

game_won(Win, Lose, Board) ->
    Win ! {you_win, Board},
    Lose ! {you_lose, Board}.

game_drawn(P1, P2, Board) ->
    P1 ! {draw, Board},
    P2 ! {draw, Board}.

First we check if it was a winning move, or a draw; in either case, the game is over and the process stops normally. As it was registered as a “transient” process, it will not be restarted. Otherwise it becomes the other player’s turn. In all cases, the updated board is sent to the client.
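The t3_game module called above isn’t shown in this post. As a guess at what has_won and is_draw might look like for the binary-keyed board, here is a minimal sketch; the module name t3_rules_sketch and the list of winning lines are assumptions of the example, not the code from the repo.

```erlang
-module(t3_rules_sketch).

-export([has_won/2, is_draw/1]).

%% The eight winning lines: three rows, three columns, two diagonals.
-define(LINES, [
    [<<"1,1">>, <<"1,2">>, <<"1,3">>],
    [<<"2,1">>, <<"2,2">>, <<"2,3">>],
    [<<"3,1">>, <<"3,2">>, <<"3,3">>],
    [<<"1,1">>, <<"2,1">>, <<"3,1">>],
    [<<"1,2">>, <<"2,2">>, <<"3,2">>],
    [<<"1,3">>, <<"2,3">>, <<"3,3">>],
    [<<"1,1">>, <<"2,2">>, <<"3,3">>],
    [<<"1,3">>, <<"2,2">>, <<"3,1">>]
]).

%% True if Symbol occupies every cell of at least one line.
has_won(Board, Symbol) ->
    lists:any(fun(Line) ->
        lists:all(fun(Cell) -> maps:get(Cell, Board) =:= Symbol end, Line)
    end, ?LINES).

%% A draw here just means that no empty cells are left; the caller is
%% expected to have checked has_won/2 first, as the FSM does.
is_draw(Board) ->
    not lists:member('_', maps:values(Board)).
```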

While this is a working implementation of a multi-player game of tic-tac-toe, there are still a lot of rough edges. For example, if the FSM crashes it will be restarted, but the game state will be lost. An improvement would be to store it (in an ETS table, or Mnesia, or even something like Riak) after successfully processing a message. It also relies on pids in a few places, and those could change if a process is restarted; it would be better to look them up in gproc using a more stable identifier. It’s also possible that the websocket connection would be interrupted, and the client would need to resync with the server.

Using Lager with Cowboy

Lager is a popular logging framework for Erlang applications. To get it working with Cowboy (assuming you’re using erlang.mk), you need to add it as a dependency to your Makefile:

DEPS = cowboy lager

and then add the parse transform to the options for erlc (as shown here):

include erlang.mk
ERLC_OPTS += +'{parse_transform, lager_transform}'

(alternatively, you could add a compile header to each file that needs it). You also need to add lager to your app.src file:

    {applications, [
        kernel,
        stdlib,
        cowboy,
        lager
    ]},
]}.

so that it is started as part of your release. You can find an example repo here.

Building a Tic Tac Toe game with Cowboy & websockets (Part 2)

In Part 1 we dealt with session management; now we’ll take a look at starting a new game.

First, we send a message from the client:

newGameBtn.onclick = function() {
    var msg = JSON.stringify({type: 'new_game', sessionId: sessionId});
    send(msg);
    newGameBtn.disabled = true;
    clearBoard();
    updateStatus('Waiting to join game...');
};

and handle that on the other side of the websocket:

websocket_handle({text, Json}, Req, State) ->
    Msg = jiffy:decode(Json, [return_maps]),
    Resp = validate_session(Msg, fun() ->
        Type = maps:get(<<"type">>, Msg),
        handle_message(Type, Msg)
    end),
    {reply, make_frame(Resp), Req, State};

validate_session(Msg, Fun) ->
    SessionId = maps:get(<<"sessionId">>, Msg),
    case gen_server:call(t3_session_manager, {validate_session, SessionId}) of
        ok -> Fun();
        {error, invalid_session} -> #{type => <<"error">>, msg => <<"invalid_session">>}
    end.

handle_message(<<"new_game">>, Msg) ->
    SessionId = maps:get(<<"sessionId">>, Msg),
    start_new_game(SessionId);

start_new_game(_SessionId) ->
    Res = try
         gen_server:call(t3_match_maker, {find_game}, 30000)
    catch
        exit:{timeout,_} -> timeout
    end,
    case Res of
        {ok, GameId} -> #{type => <<"new_game">>, id => GameId};
        timeout -> #{type => <<"no_game_available">>}
    end.

We decode the message, and validate the session; then we need to find a game, a task we delegate to the “match maker”, another gen_server. It’s possible a game won’t be found in time, so we need to handle timeouts too.
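The timeout handling can be tried out in isolation. The sketch below wraps gen_server:call/3 so that a timeout becomes a return value, and uses a process that never replies as a stand-in for a match maker with nobody waiting. Both function names are invented for the example.

```erlang
-module(call_timeout_sketch).

-export([call_or_timeout/3, silent_server/0]).

%% Turn a gen_server:call/3 timeout into a return value instead of
%% an exit in the calling process.
call_or_timeout(Server, Request, TimeoutMs) ->
    try
        {ok, gen_server:call(Server, Request, TimeoutMs)}
    catch
        exit:{timeout, _} -> timeout
    end.

%% Hypothetical stand-in for a match maker with nobody waiting: it
%% receives requests but never replies, so every call times out.
silent_server() ->
    spawn(fun Loop() -> receive _ -> Loop() end end).
```

Note that a timeout only means the caller has stopped waiting. The server still holds the request, so in the game a websocket that has given up would stay in the waiting list.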

The match maker is pretty simple, like the session manager, but might be more interesting in a real app:

handle_call({find_game}, From, State) ->
    case find_game(From, State) of
        {ok, GameId, NewState} -> {reply, {ok, GameId}, NewState};
        {wait, NewState} -> {noreply, NewState}
    end;

find_game(From, #state{waiting=[]}) ->
    {wait, #state{waiting=[From]}};

find_game({P2,_}, #state{waiting=[From|Rest]}) ->
    GameId = uuid:uuid_to_string(uuid:get_v4(), binary_standard),
    {P1,_} = From,
    {ok, _Pid} = supervisor:start_child(t3_game_sup, [{P1, P2, GameId}]),
    gen_server:reply(From, {ok, GameId}),
    {ok, GameId, #state{waiting=Rest}}.

If someone is waiting, then we start a new game; otherwise, the caller’s From reference (which contains the pid of the websocket process) gets pushed onto the queue. This should probably be a more stable reference, like a user id, from which we would look up the pid. We also don’t handle the case where a queued socket is closed, or times out waiting.
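One way the closed-socket case could be handled is to monitor each waiting process and drop its entry when the 'DOWN' message arrives. The helpers below are only a sketch of that idea; they are not part of the match maker, and storing {MonitorRef, From} pairs in the waiting list is an assumption of the example.

```erlang
-module(waiting_queue_sketch).

-export([enqueue/2, drop_dead/2]).

%% Monitor the caller and remember the monitor reference next to its
%% From tuple, so that a later 'DOWN' message can be matched to it.
enqueue(From = {Pid, _Tag}, Waiting) ->
    Ref = erlang:monitor(process, Pid),
    Waiting ++ [{Ref, From}].

%% Meant to be called from a clause such as
%% handle_info({'DOWN', Ref, process, _Pid, _Reason}, State).
drop_dead(Ref, Waiting) ->
    lists:keydelete(Ref, 1, Waiting).
```

When a game is started for a waiting entry, the monitor would also need to be removed with erlang:demonitor/1, otherwise the match maker would later receive a 'DOWN' message for a player who is no longer queued.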

We need to add two more items to our supervision tree:

init([]) ->
    Procs = [
        {t3_session_manager, {t3_session_manager, start_link, []}, permanent, 5000, worker, [t3_session_manager]},
        {t3_match_maker, {t3_match_maker, start_link, []}, permanent, 5000, worker, [t3_match_maker]},
        {t3_game_sup, {t3_game_sup, start_link, []}, permanent, 5000, supervisor, [t3_game_sup]}
    ],
    {ok, {{one_for_one, 1, 5}, Procs}}.

The game supervisor is also very simple:

init([]) ->
    Procs = [{t3_game_fsm, {t3_game_fsm, start_link, []}, transient, 5000, worker, [t3_game_fsm]}],
    {ok, {{simple_one_for_one, 5, 10}, Procs}}.

The important thing to note is the “simple one-for-one” restart strategy, where children are added dynamically, on demand.
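Here is a self-contained sketch of how a simple one-for-one supervisor passes arguments to its children. The module and its bare-process worker are invented for the example; in the game the child is the gen_fsm.

```erlang
-module(dyn_sup_sketch).
-behaviour(supervisor).

-export([start_link/0, start_worker/1]).
-export([init/1, worker_start_link/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% The list given to start_child/2 is appended to the argument list
%% in the child spec, so this ends up calling worker_start_link(Arg).
start_worker(Arg) ->
    supervisor:start_child(?MODULE, [Arg]).

init([]) ->
    Child = {worker, {?MODULE, worker_start_link, []},
             transient, 5000, worker, [?MODULE]},
    {ok, {{simple_one_for_one, 5, 10}, [Child]}}.

%% Hypothetical worker: a bare process that remembers its argument
%% and answers {get, From} messages.
worker_start_link(Arg) ->
    Pid = spawn_link(fun() -> worker_loop(Arg) end),
    {ok, Pid}.

worker_loop(Arg) ->
    receive
        {get, From} ->
            From ! {arg, Arg},
            worker_loop(Arg)
    end.
```

No children exist until start_worker/1 is called, and each call starts another instance of the same child spec. That is the shape the match maker relies on when it calls supervisor:start_child(t3_game_sup, [{P1, P2, GameId}]).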

We’ll look at the game FSM in more detail in Part 3, but when started it notifies both players:

init(Args) ->
    io:format("New game started: ~p~n", [Args]),
    [{P1, P2, GameId}] = Args,
    true = gproc:reg({n, l, GameId}),
    State = #state{p1 = P1, p2 = P2},
    P1 ! {your_turn, State#state.board},
    P2 ! {wait, State#state.board},
    {ok, p1_turn, State}.

Finally, we need to handle some new messages on the client:

    } else if (msg.type === 'new_game') {
        gameId = msg.id;
        updateStatus('New game!');
    } else if (msg.type === 'your_turn') {
        updateStatus('Your turn!');
        updateBoard(msg.data);
        enableBoard();
    } else if (msg.type === 'wait') {
        updateBoard(msg.data);
        updateStatus('Waiting for other player...');
    }

Building a Tic Tac Toe game with Cowboy & websockets (Part 1)

Tic-tac-toe (or noughts and crosses) is a good example project, as the game itself is pretty simple and won’t become a distraction. I’m going to try and build a version using Erlang, and more specifically Cowboy: an HTTP server/framework.

I’m going to assume you know the basics of setting up a new project, and skip straight to the more interesting bits. The goal is to create a multi-player, websockets & HTML based version of tic-tac-toe. As it’s turn-based, we avoid a lot of the really thorny problems of “real-time” multi-player games.

My first step was to create a session as soon as the socket is opened:

socket.onopen = function () {
    socket.send('new_session')
}

The backend then needs to handle this frame. I decided to add a session manager as a gen_server:

-record(state, {sessions=#{}}).

handle_call(new_session, _From, State) ->
    SessionId = uuid:uuid_to_string(uuid:get_v4(), binary_standard),
    NewState = update_session(SessionId, State),
    {reply, {ok, SessionId}, NewState};

update_session(SessionId, State) ->
    Expiry = half_an_hour_from_now(),
    prune_expired_sessions(State#state{sessions=maps:put(SessionId, Expiry, State#state.sessions)}).

half_an_hour_from_now() ->
    Now = calendar:universal_time(),
    calendar:gregorian_seconds_to_datetime(calendar:datetime_to_gregorian_seconds(Now) + (30 * 60)).

prune_expired_sessions(State) ->
    SessionIds = maps:keys(State#state.sessions),
    {_, ExpiredSessions} = lists:partition(fun(S) -> session_valid(S, State#state.sessions) end, SessionIds),
    State#state{sessions=maps:without(ExpiredSessions, State#state.sessions)}.

session_valid(SessionId, Sessions) ->
    case maps:find(SessionId, Sessions) of
        error -> false;
        {ok, Expires} ->
            Expires > calendar:universal_time()
    end.
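The expiry check works because datetime tuples of the form {{Y,M,D},{H,Mi,S}} compare correctly under Erlang’s normal term ordering. The sketch below shows the same arithmetic as pure functions that take “now” as an argument, which makes them easy to try in the shell. The function names are made up, and maps:filter/2 is swapped in for the partition-and-without approach above.

```erlang
-module(session_expiry_sketch).

-export([expiry_from/1, prune/2]).

%% Thirty minutes after the given datetime.
expiry_from(Now) ->
    Secs = calendar:datetime_to_gregorian_seconds(Now),
    calendar:gregorian_seconds_to_datetime(Secs + 30 * 60).

%% Keep only the sessions whose expiry is still in the future
%% relative to Now.
prune(Sessions, Now) ->
    maps:filter(fun(_Id, Expires) -> Expires > Now end, Sessions).
```

Going through gregorian seconds is what makes the addition safe across midnight: half an hour after {{2024,1,1},{23,45,0}} is {{2024,1,2},{0,15,0}}.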

In a more realistic app, you would probably want to perform some authentication here, and use a datastore of some kind; but we’ll just create a new session and return the id. We also need to add the new server to the supervision tree:

init([]) ->
    Procs = [
        {t3_session_manager, {t3_session_manager, start_link, []}, permanent, 5000, worker, [t3_session_manager]}
    ],
    {ok, {{one_for_one, 1, 5}, Procs}}.

We can then call the server from the ws handler:

websocket_handle({text, <<"new_session">>}, Req, State) ->
    Resp = start_new_session(),
    {reply, make_frame(Resp), Req, State};

start_new_session() ->
    {ok, SessionId} = gen_server:call(t3_session_manager, new_session),
    #{type => <<"new_session">>, sessionId => SessionId}.

make_frame(Msg) ->
    Json = jiffy:encode(Msg),
    {text, Json}.

We can call the server using its module name, rather than a pid, because it registered itself when starting:

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

(Watch out though: this effectively makes it a singleton, which could be a choke-point in a real app.) We need to handle the response on the client:

socket.onmessage = function(ev) {
    console.log('Received data: ' + ev.data);
    var msg = JSON.parse(ev.data);

    if (msg.type === 'new_session') {
        sessionId = msg.sessionId;
        newGameBtn.disabled = false;
    }
}

And we’ll probably also need to be able to validate a session at some point:

handle_call({validate_session, SessionId}, _From, State) ->
    case session_valid(SessionId, State#state.sessions) of
        false ->
            {reply, {error, invalid_session}, State};
        true ->
            %% sliding expiry window
            NewState = update_session(SessionId, State),
            {reply, ok, NewState}
    end;

You can find the full source code here. In Part 2, we’ll look at starting a new game.