Word chains (Part 2)

Previously, we laid some groundwork for generating word chains. Rather than arbitrarily returning one word, we might as well get all the words that are one letter different from the first word:

prop_next_words_should_be_near() ->
    ?FORALL({FirstWord, LastWord}, valid_words(),
        begin
            NextWords = word_chains:next_words(FirstWord),
            InvalidWords = lists:filter(fun(W) -> word_chains:get_word_distance(W, FirstWord) =/= 1 end, NextWords),
            length(InvalidWords) =:= 0
        end).

We can calculate the “word distance” using map/reduce:

get_word_distance(Word1, Word2) ->
    Differences = lists:zipwith(fun(X, Y) -> case X =:= Y of true -> 0; false -> 1 end end, Word1, Word2),
    lists:foldl(fun(D, Acc) -> Acc + D end, 0, Differences).

For each letter in Word1, we compare it with the same (position) letter in Word2, and assign a 0 if it matches and a 1 if it differs. The sum of these values tells us the difference between the 2 words.

2> word_chains:get_word_distance("cat", "cat").
0
3> word_chains:get_word_distance("cat", "cot").
1
4> word_chains:get_word_distance("cat", "cog").
2

Using this helper function, we can easily find all the possible next words:

next_words(FirstWord) ->
    WordList = word_list(),
    SameLengthWords = lists:filter(fun(W) -> length(W) =:= length(FirstWord) end, WordList),
    WordDistances = lists:map(fun(W) -> {W, get_word_distance(W, FirstWord)} end, SameLengthWords),
    lists:map(fun({Word, _}) -> Word end, lists:filter(fun({_, Distance}) -> Distance =:= 1 end, WordDistances)).

Almost there! Next time, we will actually start generating some word chains.

Word chains (Part 1)

I’ve recently been using the word chains kata for interviewing, and I thought it might be interesting to try using Erlang, and property testing.

The first step is to get a word list. I thought most linux distros came with a dictionary file, but my laptop only had a crack lib, which wasn’t really what I was looking for.

I had used this npm package before, so I just downloaded the text file it provides. With that hand, getting a list of words is easy:

word_list() ->
    {ok, Data} = file:read_file("words.txt"),
    binary:split(Data, [<<"\n">>], [global]).

The next step is to find all words that are one letter away from the first word. So we create a property:

prop_next_word_should_be_new() ->
    ?FORALL({FirstWord, LastWord}, valid_words(),
        begin
            NextWord = word_chains:next_word(FirstWord, LastWord),
            NextWord =/= FirstWord
        end).

For each first word/last word pair, we check that the next word is different from the first word. We also need a generator, of valid words:

valid_words() ->
    ?SUCHTHAT({FirstWord, LastWord},
        ?LET(N, choose(2, 10),
            begin
                WordList = word_chains:word_list(),
                SameLengthWords = lists:filter(fun(W) -> length(W) =:= N end, WordList),
                {random_word(SameLengthWords), random_word(SameLengthWords)}
            end),
    FirstWord =/= LastWord).

random_word(Words) ->
    lists:nth(rand:uniform(length(Words)), Words).

First we pick a random number, in the range (2, 10), and then pick 2 words of that length, from the full word list, at random. This could result in the same word being used as both first & last word, so we filter that out, using the ?SUCHTHAT macro.

For now, we can make this pass by simply returning the last word:

next_word(_FirstWord, LastWord) ->
    LastWord.
$ ./rebar3 proper
===> Verifying dependencies...
===> Compiling word_chains
===> Testing prop_word_chains:prop_next_word_should_be_new()
....................................................................................................
OK: Passed 100 test(s).
===> 
1/1 properties passed

Boom! Next time, a more useful implementation of next word.

Grouping by time period in Redshift

We use Redshift for reporting (or, more accurately, the tool we use for reporting uses Redshift).

It’s pretty simple to calculate daily rollups, using datetrunc:

select date_trunc('day', foo) date, sum(bar) bar
from [baz]
where ...
group by date

Or just by casting to a date:

select [foo:date] date, sum(bar) bar
from [baz]
where ...
group by date

But what if you want to slice into smaller time periods?

The easiest way I’ve found, is to use to_char, and format the timestamp appropriately. e.g.

select to_char(foo, 'YYYY-MM-DD HH24:MI') date, sum(bar) bar
from [baz]
where ...
group by date

Using “free” wi-fi on Linux

I recently found myself at a train station that claimed to offer “free super-fast wi-fi”; but when I connected to it, I was stuck with a little question mark in the gnome toolbar, and the chrome dinosaur.

I’m not the first person to note this, but I did manage to get it working… eventually.

I opened up a terminal, and even though the browser was sad, I could ping:

$ ping google.com
PING google.com (216.58.214.14) 56(84) bytes of data.
From _gateway (10.102.0.5) icmp_seq=1 Destination Net Prohibited

Sort of. DNS was working, anyway. So I tried curl:

$ curl -vL "https://www.google.com"
* Rebuilt URL to: https://www.google.com/
*   Trying 172.217.19.196...
* TCP_NODELAY set
* Connected to www.google.com (172.217.19.196) port 443 (#0)
... snip ...

And it looked like I was getting the actual Google homepage, not a hijacked page, which was interesting.

At that point I took the IP address from that request, and stuck it directly in the browser, hoping to skip straight there; and now the browser decided to show me the T&C page I needed. Once I’d ticked the right box, I was online.

Adding/removing LBaaS pool members, using Ansible

We use a load balancer to achieve “zero downtime deployments”. During a deploy, we take each node out of the rotation, update the code, and put it back in. If you’re using the Rackspace Cloud, there are ansible modules provided (as described here).

If you’re just using OpenStack though, it’s not quite as simple. While there are some ansible modules, there doesn’t seem to be one for managing LBaaS pool members. It is possible to create a pool member using a heat template, but that doesn’t really fit into an ansible playbook.

The only alternative seems to be using the neutron cli, which is deprecated (but doesn’t seem to have been replaced, for controlling a LBaaS anyway).

Fortunately, it can return output in a machine readable format (json), which makes calling it from ansible relatively simple. To add a pool member, in an idempotent fashion:

- name: Get current pool members
  local_action: command neutron lbaas-member-list test-gib-lb-pool --format json
  register: member_list

- set_fact: pool_member={{ member_list.stdout | from_json | selectattr("address", "equalto", ansible_default_ipv4.address) | map(attribute="id") | list }}

- name: Remove node from LB
  local_action: command neutron lbaas-member-delete {{ pool_member | first }} test-gib-lb-pool
  when: (pool_member | count) > 0

You first need to check if the current host is in the list of pool members. If so, remove it; otherwise, do nothing.

Once the code is updated, and the services restarted, you can put the node back in the pool:

- name: Add node to LB
  local_action: command neutron lbaas-member-create --name {{ inventory_hostname }} --subnet gamevy_subnet --address {{ ansible_default_ipv4.address }} --protocol-port 443 test-gib-lb-pool

This seems to be quite effective, but we have seen some errors when the members are deleted (which isn’t exactly “zero” downtime). If I find a better approach I’ll update this.

A squad of squids

We use squid as a forward proxy, to ensure that all outbound requests come from the same (whitelisted) IP addresses.

Originally, we chose the proxy to use at deploy time, using an ansible template:

"proxy": "http://{{ hostvars['proxy-' + (["01", "02"] | random())].ansible_default_ipv4.address }}:3128"

This “load balanced” the traffic, but wouldn’t be very useful if one of the proxies disappeared (the main reason for having two of them!)

I briefly considered using nginx to pass requests to squid, as it was already installed on the app servers; but quickly realised that wouldn’t work, for the same reason we needed to use squid in the first place: it can’t proxy TLS traffic.

A bit of research revealed that you can group multiple squid instances into a hierarchy. Taken care of by some more templating, in the squid config this time:

{% if inventory_hostname in groups['app_server'] %}
{% for host in proxy_ips %}
cache_peer {{ host }} parent 3128 0 round-robin
{% endfor %}
never_direct allow all
{% endif %}

Where the list of IPs is a projection from inventory data:

     
proxy_ips: "{{ groups['proxy_server'] | map('extract', hostvars, ['ansible_default_ipv4', 'address']) | list }}"

Withdraw!

The next transition is a withdrawal. Naively, we add another command:

open(_Data) ->
    [
        {open, {call, bank_statem, deposit, [pos_integer()]}},
        {open, {call, bank_statem, withdraw, [pos_integer()]}}
    ].

Which causes an immediate failure:

===> Testing prop_bank_statem:prop_test()
...
=ERROR REPORT==== 29-Apr-2018::15:08:41 ===
** State machine bank_statem terminating
** Last event = {{call,{,#Ref}},
                 {withdraw,2}}
** When server state  = {open,#{balance => 1}}
** Reason for termination = error:function_clause
** Callback mode = state_functions
** Stacktrace =
**  [{bank_statem,open,
                  [{call,{,#Ref}},
                   {withdraw,2},
                   #{balance => 1}],
                  [{file,"/app/_build/test/lib/bank_statem/src/bank_statem.erl"},
                   {line,46}]},
...
Crash dump is being written to: erl_crash.dump...done

Having the entire process crash isn’t very useful, so we follow the recommendation to use the TRAPEXIT macro:

prop_test() ->
    ?FORALL(Cmds, proper_fsm:commands(?MODULE),
        ?TRAPEXIT(
            begin
                bank_statem:start_link(),
                {History,State,Result} = proper_fsm:run_commands(?MODULE, Cmds),
                bank_statem:stop(),
                ?WHENFAIL(io:format("History: ~p\nState: ~p\nResult: ~p\n", [History,State,Result]),
                    aggregate(zip(proper_fsm:state_names(History), command_names(Cmds)), Result =:= ok))
            end)).

This allows shrinking to take place, and gives us (slightly) more useful output:

===> Testing prop_bank_statem:prop_test()
.!
Failed: After 2 test(s).
A linked process died with reason {function_clause,[{bank_statem,open,[{call,{,#Ref}},{withdraw,3},#{balance=>0}],[{file,[47,97,112,112,47,95,98,117,105,108,100,47,116,101,115,116,47,108,105,98,47,98,97,110,107,95,115,116,97,116,101,109,47,115,114,99,47,98,97,110,107,95,115,116,97,116,101,109,46,101,114,108]},{line,46}]},{gen_statem,call_state_function,5,[{file,[103,101,110,95,115,116,97,116,101,109,46,101,114,108]},{line,1633}]},{gen_statem,loop_event_state_function,6,[{file,[103,101,110,95,115,116,97,116,101,109,46,101,114,108]},{line,1023}]},{proc_lib,init_p_do_apply,3,[{file,[112,114,111,99,95,108,105,98,46,101,114,108]},{line,247}]}]}.

=ERROR REPORT==== 29-Apr-2018::15:11:36 ===
** State machine bank_statem terminating
** Last event = {{call,{,#Ref}},
                 {withdraw,3}}
** When server state  = {open,#{balance => 0}}
** Reason for termination = error:function_clause
** Callback mode = state_functions
** Stacktrace =
**  [{bank_statem,open,
                  [{call,{,#Ref}},
                   {withdraw,3},
                   #{balance => 0}],
                  [{file,"/app/_build/test/lib/bank_statem/src/bank_statem.erl"},
                   {line,46}]},
...
[{set,{var,1},{call,bank_statem,withdraw,[3]}}]

Shrinking (0 time(s))
[{set,{var,1},{call,bank_statem,withdraw,[3]}}]

=ERROR REPORT==== 29-Apr-2018::15:11:36 ===
** State machine bank_statem terminating
** Last event = {{call,{,#Ref}},
                 {withdraw,3}}
** When server state  = {open,#{balance => 0}}
** Reason for termination = error:function_clause
** Callback mode = state_functions
** Stacktrace =
**  [{bank_statem,open,
                  [{call,{,#Ref}},
                   {withdraw,3},
                   #{balance => 0}],
                  [{file,"/app/_build/test/lib/bank_statem/src/bank_statem.erl"},
                   {line,46}]},
....
===> 
0/1 properties passed, 1 failed
===> Failed test cases:
  prop_bank_statem:prop_test() -> false

We can see that the error is caused by trying to withdraw funds that don’t exist. This can be avoided by adding a pre-condition, to ensure that invalid commands aren’t generated:

precondition(_From, _To, #{balance:=Balance}, {call, bank_statem, withdraw, [Amount]}) ->
    Balance - Amount >= 0;

And also update our model when a withdrawal occurs:

next_state_data(_From, _To, #{balance:=Balance}=Data, _Res, {call, bank_statem, withdraw, [Amount]}) ->
    NewBalance = Balance - Amount,
    Data#{balance:=NewBalance};

At this point, I would expect the property to pass, but it’s still failing:

===> Testing prop_bank_statem:prop_test()
...................!
Failed: After 20 test(s).
A linked process died with reason {function_clause,[{bank_statem,open,[{call,{,#Ref}},{withdraw,9},#{balance=>9}],[{file,[47,97,112,112,47,95,98,117,105,108,100,47,116,101,115,116,47,108,105,98,47,98,97,110,107,95,115,116,97,116,101,109,47,115,114,99,47,98,97,110,107,95,115,116,97,116,101,109,46,101,114,108]},{line,46}]},{gen_statem,call_state_function,5,[{file,[103,101,110,95,115,116,97,116,101,109,46,101,114,108]},{line,1633}]},{gen_statem,loop_event_state_function,6,[{file,[103,101,110,95,115,116,97,116,101,109,46,101,114,108]},{line,1023}]},{proc_lib,init_p_do_apply,3,[{file,[112,114,111,99,95,108,105,98,46,101,114,108]},{line,247}]}]}.

=ERROR REPORT==== 29-Apr-2018::15:14:34 ===
** State machine bank_statem terminating
** Last event = {{call,{,#Ref}},
                 {withdraw,9}}
** When server state  = {open,#{balance => 9}}
** Reason for termination = error:function_clause
** Callback mode = state_functions
** Stacktrace =
**  [{bank_statem,open,
                  [{call,{,#Ref}},
                   {withdraw,9},
                   #{balance => 9}],
                  [{file,"/app/_build/test/lib/bank_statem/src/bank_statem.erl"},
                   {line,46}]},
...

Shrinking .
=ERROR REPORT==== 29-Apr-2018::15:14:34 ===
** State machine bank_statem terminating
** Last event = {{call,{,#Ref}},
                 {withdraw,9}}
** When server state  = {open,#{balance => 9}}
** Reason for termination = error:function_clause
** Callback mode = state_functions
** Stacktrace =
**  [{bank_statem,open,
                  [{call,{,#Ref}},
                   {withdraw,9},
                   #{balance => 9}],
                  [{file,"/app/_build/test/lib/bank_statem/src/bank_statem.erl"},
                   {line,46}]},
...
.
=ERROR REPORT==== 29-Apr-2018::15:14:34 ===
** State machine bank_statem terminating
** Last event = {{call,{,#Ref}},
                 {withdraw,9}}
** When server state  = {open,#{balance => 9}}
** Reason for termination = error:function_clause
** Callback mode = state_functions
** Stacktrace =
**  [{bank_statem,open,
                  [{call,{,#Ref}},
                   {withdraw,9},
                   #{balance => 9}],
                  [{file,"/app/_build/test/lib/bank_statem/src/bank_statem.erl"},
                   {line,46}]},
...
(2 time(s))

=ERROR REPORT==== 29-Apr-2018::15:14:34 ===
** State machine bank_statem terminating
** Last event = {{call,{,#Ref}},
                 {withdraw,9}}
** When server state  = {open,#{balance => 9}}
** Reason for termination = error:function_clause
** Callback mode = state_functions
** Stacktrace =
**  [{bank_statem,open,
                  [{call,{,#Ref}},
                   {withdraw,9},
                   #{balance => 9}],
                  [{file,"/app/_build/test/lib/bank_statem/src/bank_statem.erl"},
                   {line,46}]},
...
[{set,{var,1},{call,bank_statem,deposit,[2]}},{set,{var,2},{call,bank_statem,deposit,[7]}},{set,{var,3},{call,bank_statem,withdraw,[9]}}]
===> 
0/1 properties passed, 1 failed
===> Failed test cases:
  prop_bank_statem:prop_test() -> false

Rather embarrassingly, that’s an actual bug in my code, the guard is checking that the remaining balance is strictly greater than 0. Hurray for property testing!