Minimal web app with cowboy & rebar3

The Erlang Docker images now include rebar3, so it’s very easy to get started with a new project (the name needs to be a valid Erlang atom, so underscores rather than dashes!):

$ docker run -v $PWD:/app/foo_app -w /app erlang:25 rebar3 new release foo_app
Unable to find image 'erlang:25' locally
25: Pulling from library/erlang
e756f3fdd6a3: Pull complete 
...
Digest: sha256:4eafc58e4475a7be2416af55ea142a7cd00c14b6ec2490a38db3a0869efde7e4
Status: Downloaded newer image for erlang:25
===> Writing foo_app/apps/foo_app/src/foo_app_app.erl
===> Writing foo_app/apps/foo_app/src/foo_app_sup.erl
===> Writing foo_app/apps/foo_app/src/foo_app.app.src
===> Writing foo_app/rebar.config
===> Writing foo_app/config/sys.config
===> Writing foo_app/config/vm.args
===> Writing foo_app/.gitignore
===> Writing foo_app/LICENSE
===> Writing foo_app/README.md

As dockerd runs as root, the generated files will be owned by root, so you then need to chown them (you can run the docker command as the current user instead, but that didn’t go well when I tried it).
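Something like this, adjusting the user/group to taste:

sudo chown -R $USER:$USER foo_app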

Then you need a Dockerfile:

# Build stage 0
FROM erlang:25-alpine

RUN apk add --no-cache git

# Set working directory
RUN mkdir /buildroot
WORKDIR /buildroot

# Copy our Erlang test application
COPY rebar.config .
COPY apps/ apps/
COPY config/ config/

# And build the release
RUN rebar3 as prod release

# Build stage 1
FROM alpine

# Install some libs
RUN apk add --no-cache openssl ncurses-libs libstdc++

# Install the released application
COPY --from=0 /buildroot/_build/prod/rel/foo_app /foo_app

# Expose relevant ports
EXPOSE 8080

CMD ["/foo_app/bin/foo_app", "foreground"]

At this point, you should be able to build the image:

$ docker build -t foo_app .
Sending build context to Docker daemon  104.4kB
Step 1/13 : FROM erlang:25-alpine
25-alpine: Pulling from library/erlang
2408cc74d12b: Already exists 
1e90e213ba89: Pull complete 
Digest: sha256:1fdd18a383206eeba257f18c5dd22d7f381942eda4cd76a88e36a1d3247c4130
Status: Downloaded newer image for erlang:25-alpine
 ---> 8f1437fc7749
Step 2/13 : RUN apk add --no-cache git
 ---> Running in 1af099a684a7
fetch https://dl-cdn.alpinelinux.org/alpine/v3.16/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.16/community/x86_64/APKINDEX.tar.gz
(1/6) Installing brotli-libs (1.0.9-r6)
(2/6) Installing nghttp2-libs (1.47.0-r0)
(3/6) Installing libcurl (7.83.1-r1)
(4/6) Installing expat (2.4.8-r0)
(5/6) Installing pcre2 (10.40-r0)
(6/6) Installing git (2.36.1-r0)
Executing busybox-1.35.0-r13.trigger
OK: 23 MiB in 30 packages
Removing intermediate container 1af099a684a7
 ---> 99d4529dc490
Step 3/13 : RUN mkdir /buildroot
 ---> Running in 47daa05e777b
Removing intermediate container 47daa05e777b
 ---> ac097064f098
Step 4/13 : WORKDIR /buildroot
 ---> Running in 51479ca5c35a
Removing intermediate container 51479ca5c35a
 ---> 649f740b900b
Step 5/13 : COPY rebar.config .
 ---> 3fe63710afa6
Step 6/13 : COPY apps/ apps/
 ---> 91e46ef9dfc9
Step 7/13 : COPY config/ config/
 ---> 4194973d4e23
Step 8/13 : RUN rebar3 as prod release
 ---> Running in 7e7a2112d86e
===> Verifying dependencies...
===> Analyzing applications...
===> Compiling foo_app
===> Assembling release foo_app-0.1.0...
===> Release successfully assembled: _build/prod/rel/foo_app
Removing intermediate container 7e7a2112d86e
 ---> 91154e8634db
Step 9/13 : FROM alpine
latest: Pulling from library/alpine
2408cc74d12b: Already exists 
Digest: sha256:686d8c9dfa6f3ccfc8230bc3178d23f84eeaf7e457f36f271ab1acc53015037c
Status: Downloaded newer image for alpine:latest
 ---> e66264b98777
Step 10/13 : RUN apk add --no-cache openssl ncurses-libs libstdc++
 ---> Running in 935857dec575
fetch https://dl-cdn.alpinelinux.org/alpine/v3.16/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.16/community/x86_64/APKINDEX.tar.gz
(1/5) Installing libgcc (11.2.1_git20220219-r2)
(2/5) Installing libstdc++ (11.2.1_git20220219-r2)
(3/5) Installing ncurses-terminfo-base (6.3_p20220521-r0)
(4/5) Installing ncurses-libs (6.3_p20220521-r0)
(5/5) Installing openssl (1.1.1o-r0)
Executing busybox-1.35.0-r13.trigger
OK: 9 MiB in 19 packages
Removing intermediate container 935857dec575
 ---> e62baafed2cd
Step 11/13 : COPY --from=0 /buildroot/_build/prod/rel/foo_app /foo_app
 ---> d768a4b774ea
Step 12/13 : EXPOSE 8080
 ---> Running in 11f013006a16
Removing intermediate container 11f013006a16
 ---> 0e347e1c2d46
Step 13/13 : CMD ["/foo_app/bin/foo_app", "foreground"]
 ---> Running in 7bf493256ded
Removing intermediate container 7bf493256ded
 ---> dc0f1632d27f
Successfully built dc0f1632d27f
Successfully tagged foo_app:latest

This is a good approach for building a final deployable image, but it’s a bit painful for local development, and doesn’t make the best use of Docker layer caching (every source change re-fetches the dependencies). For day-to-day development you probably want to mount the source as a volume instead.
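Something like this works as a rough sketch: run the stock erlang image with the project mounted from the host, and use rebar3 shell so code changes don’t need an image rebuild.

docker run -it --rm -v $PWD:/app -w /app -p 8080:8080 erlang:25 rebar3 shell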

You should now be able to run the application:

$ docker run -p 8080:8080 --rm foo_app
Exec: /foo_app/erts-13.0.1/bin/erlexec -noinput +Bd -boot /foo_app/releases/0.1.0/start -mode embedded -boot_var SYSTEM_LIB_DIR /foo_app/lib -config /foo_app/releases/0.1.0/sys.config -args_file /foo_app/releases/0.1.0/vm.args -- foreground
Root: /foo_app
/foo_app

Unfortunately Ctrl+C doesn’t stop the container, so you need to docker kill it from another terminal.
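For example, giving the container a name makes it easy to target:

docker run -p 8080:8080 --rm --name foo_app foo_app

# from another terminal
docker kill foo_app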

This isn’t a web app yet, so we need to add cowboy to the dependencies list in rebar.config:

{deps, [
    {cowboy,"2.9.0"}
]}.

I’m pinning a specific version from hex here; you could also use a git dependency, if you prefer. If you build again, you should see some extra steps:

Step 8/13 : RUN rebar3 as prod release
 ---> Running in 95791181399b
===> Verifying dependencies...
===> Fetching cowboy v2.9.0
===> Fetching cowlib v2.11.0
===> Fetching ranch v1.8.0
===> Analyzing applications...
===> Compiling cowlib
===> Compiling ranch
===> Compiling cowboy
===> Analyzing applications...
===> Compiling foo_app
===> Assembling release foo_app-0.1.0...
===> Release successfully assembled: _build/prod/rel/foo_app

You also need to add cowboy to the list in the app.src file:

  {applications,
   [kernel,
    stdlib,
    cowboy
   ]},

You can then start cowboy from your _app module (foo_app_app.erl):

start(_StartType, _StartArgs) ->
    Port = 8080,
    Dispatch = cowboy_router:compile([
        {'_', [{"/ping", ping_handler, []}]}
    ]), 
    {ok, _} = cowboy:start_clear(my_http_listener,
        [{port, Port}],
        #{env => #{dispatch => Dispatch}}
    ), 
    foo_app_sup:start_link().

And create a handler:

-module(ping_handler).

-export([init/2]).

init(Req, State) ->
    {ok, Req, State}.

This is the absolute minimum, and will return a 204 for any request:

$ curl -I -XGET "http://localhost:8080/ping"
HTTP/1.1 204 No Content

Or, for something a bit more realistic:

init(#{method := Method} = Req, State) ->
    handle_req(Method, Req, State).

handle_req(<<"GET">>, Req, State) ->
    {ok, text_plain(Req, <<"pong">>), State};

handle_req(_Method, Req, State) ->
    {ok, cowboy_req:reply(404, Req), State}.

text_plain(Request, ResponseBody) ->
    ResponseHeaders = #{
        <<"content-type">> => <<"text/plain">>
    },
    cowboy_req:reply(200, ResponseHeaders, ResponseBody, Request).

This will return some text, for a GET:

$ curl "http://localhost:8080/ping"
pong

And a 404 for any other HTTP method:

$ curl -I -XPOST "http://localhost:8080/ping"
HTTP/1.1 404 Not Found
content-length: 0
date: Fri, 17 Jun 2022 11:54:35 GMT
server: Cowboy

mermaid + gist = success

GitHub recently added mermaid rendering to their markdown dialect, making it simple to render your diagrams, whether in a readme or just thrown together in a gist.

As soon as you create a new gist (remembering to use the .md extension, to enable preview), you can easily draw a sequence diagram:

```mermaid
sequenceDiagram
    Alice->>John: Hello John, how are you?
    John-->>Alice: Great!
    Alice-)John: See you later!
```

or finite automata:

```mermaid
stateDiagram-v2
    state fork_state <<fork>>
      [*] --> fork_state
      fork_state --> State2
      fork_state --> State3

      state join_state <<join>>
      State2 --> join_state
      State3 --> join_state
      join_state --> State4
      State4 --> [*]
```

Using docker, instead of virtualenv

If you fancy a change, you just need a Dockerfile:

FROM python:3.7

RUN pip install pytest
COPY requirements.txt .
RUN pip3 install -r requirements.txt

and a pytest.ini, in the root (don’t ask):

[pytest]
pythonpath = .

Then you can build the image:

docker build -t foo .

And run the tests:

docker run -it --rm -v $PWD:/app -w /app foo pytest tests/

Or run the app locally, e.g. a lambda func:

docker run -it --rm -v $PWD:/app -w /app -e PGHOST=... -e PGUSER=... -e PGPASSWORD=... foo python -c 'import app; app.bar(None, None)'

Is it better? Probably not, you’re just swapping one set of problems for a different set 🤷

Only upload changed files to (a different) s3 bucket

We have a PR build that uploads the generated (html) output to a public s3 bucket, so you can check the results before merging. This is useful, but the output has grown over time, and is now ~6GB; so the job takes a long time to run, and uploads a lot of unnecessary files.

I recently switched the trunk build to use sync from the AWS CLI (rather than s3cmd), which was noticeably faster; so I thought I’d try using the --dryrun option, to generate a diff against the production bucket.

docker run --rm -v $PWD:/app -w /app -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY amazon/aws-cli s3 sync output/ s3://foo-prod --dryrun --size-only

Unfortunately, there are no machine-readable output options for that command, so we need to get our awk on. My first attempt was to generate a cp command for each line:

docker run ... | awk '{sub(/output\//, ""); sub(/&/, "\\\\&"); print "docker run --rm -v $PWD:/app -w /app -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY amazon/aws-cli s3 cp output/"$3" s3://foo-pr/"ENVIRON["GIT_COMMIT"]"/"$3}'

Once you’re satisfied the incantation looks correct, you can pipe the whole lot to bash:

docker run ... | awk ... | bash

With this working locally, it seemed simple to just run that command as a pipeline step. It was not. Trying to escape the combination of quotes in groovy proved fruitless, and in the end I just threw in a bash script, and called that from the Jenkinsfile.
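For reference, the script ended up as little more than the pipeline above in a file (upload-changed.sh is a made-up name here), called from the Jenkinsfile with a plain sh step:

#!/usr/bin/env bash
# upload-changed.sh: dry-run sync against prod, turn each changed file into a
# cp to the PR bucket, and execute the generated commands.
# GIT_COMMIT is provided by Jenkins in the environment.
set -euo pipefail

docker run --rm -v $PWD:/app -w /app -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY \
    amazon/aws-cli s3 sync output/ s3://foo-prod --dryrun --size-only \
  | awk '{sub(/output\//, ""); sub(/&/, "\\\\&"); print "docker run --rm -v $PWD:/app -w /app -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY amazon/aws-cli s3 cp output/"$3" s3://foo-pr/"ENVIRON["GIT_COMMIT"]"/"$3}' \
  | bash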

While this solved one problem, it created another: the build now took nearly twice as long to run, presumably due to copying files one at a time. I was considering using the SDK, when I realised I could just copy the changed files locally, and sync that folder instead:
mkdir changed
docker run --rm -v $PWD:/app -w /app -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY amazon/aws-cli s3 sync output/ s3://foo-prod --dryrun --size-only | awk '{sub(/&/, "\\\\&"); print "cp "$3" changed/"}' | bash
docker run --rm -v $PWD:/app -w /app -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY amazon/aws-cli s3 sync changed/ s3://foo-pr/$GIT_COMMIT --size-only

Finally, a build that is both quick(er), and uploads only the changed files!

Filtering a CSV in R

I had a 2M line CSV (exported from Redshift), and needed to do some sanity checking. In SQL I would have written something like this:

select count(*)
from foo
where date >= '2020-10-1' and date < '2020-11-1'
and foo = 'bar'

So what’s the equivalent in R?

The recommendation seems to be to use data.table:

install.packages('data.table')
library(data.table)
data <- fread("foo.csv")
str(data)

Filtering by value is easy:

foo <- data[foo == 'bar']

But a date range is a little trickier. R seems to know that the strings are a date format:

POSIXct, format: "2020-05-21 14:16:24" "2020-05-21 14:16:28" ...

I imagine it’s possible to truncate those values, but the easiest thing for me was to add a new col:

foo$date <- as.Date(foo$started_at)

and then use that with subset:

> nrow(subset(foo, date >= "2020-10-1" & date < "2020-11-1"))
[1] 73594

Using dependencies as a test oracle

I have long been a disciple of what is sometimes known as the “London school” of TDD (or “outside in” design), but I like to think I’m open to alternatives, when proven useful.

With that in mind, I found James Shore’s testing without mocks series very interesting. While I’m not quite ready to dive in at the deep end of that approach, one of the reasons to mock your dependencies (other than avoiding IO) is to remove the complexity from your tests, and James offers a handy alternative.

beforeEach(function() {
    dep1 = sinon.stub().resolves();
    ...
    handler = new Handler(dep1, dep2, dep3);
});

Rather than using [insert favourite mocking library] to represent those dependencies, and risking the slippage that can occur when the real version changes but the tests are not updated (if you haven’t got contract tests for everything), you can use the real object (ideally some pure “business” function) both in the set up and in your assertions, as a “test oracle”.

beforeEach(function() {
    dep1 = new Dep1();
    ...
});

it("should return the expected flurble", function() {
    ...
    const res = await handler.handle(req);

    expect(res.flurble).to.equal(dep1.bar(req.foo));
});

This way, if the implementation of the dependency changes, the test should still pass; unless it would actually affect the SUT.

I’m sure this approach comes with its own tradeoffs, and won’t help you with anything other than simple dependencies, but it can be useful in situations where you would like to use the real dependency and still keep the tests relatively simple.

(This is probably another force pushing in the direction of a ports and adapters architecture (or impure-pure sandwich), allowing you to use “sociable” tests in the kernel, and narrow integration tests at the edges.)

RDS Postgresql WalWriteLock

We recently had a service degradation/outage, which manifested as WalWriteLock waits in Performance Insights.

The direct cause was autovacuum on a large (heavily updated) table, but it had run ~1 hour earlier, without any issues.

Our short-term solution was to raise the autovacuum (av) threshold, and kill the process. But it had to run again at some point, or we’d be in real trouble.

We checked the usual suspects for av slowdown, but couldn’t find any transaction older than the av process itself, or any abandoned replication slots.
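(Roughly the kind of thing we checked, assuming you can psql to the instance:)

# any transaction older than the autovacuum worker?
psql -c "select pid, state, xact_start, query from pg_stat_activity order by xact_start nulls last limit 10;"

# any forgotten replication slots holding back the horizon?
psql -c "select slot_name, active, restart_lsn from pg_replication_slots;"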

We don’t currently have a replica (although we are using multi-AZ); but this prompted us to realise that we still had wal_level set to logical, after using DMS to upgrade from pg 10 to 11. This generates considerably more WAL than the next level down (replica).
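On RDS the switch is the rds.logical_replication parameter in the DB parameter group (the group name below is made up); it’s a static parameter, hence the failover mentioned next:

aws rds modify-db-parameter-group \
    --db-parameter-group-name foo-pg11 \
    --parameters "ParameterName=rds.logical_replication,ParameterValue=0,ApplyMethod=pending-reboot"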

After turning that off (and failing over), we triggered AV again, but were still seeing high WalWriteLock contention. Eventually, we found this 10 year old breadcrumb on the pgsql-admin mailing list:

Is it vacuuming a table which was bulk loaded at some time in the past? If so, this can happen any time later (usually during busy periods when many transactions numbers are being assigned)

https://www.postgresql.org/message-id/4DBFF5AE020000250003D1D9%40gw.wicourts.gov

So it seems like this was a little treat left for us by DMS which, combined with the extra WAL from logical replication, was enough to push us over the edge at a busy time.

Once that particular AV had managed to complete, the next one was back to normal.

Triggering a cron lambda

Once you have a lambda ready to run, you need an EventBridge rule to trigger it:

docker run --rm -it -v ~/.aws:/root/.aws -v $PWD:/data -w /data -e AWS_PROFILE amazon/aws-cli events put-rule --name foo --schedule-expression 'cron(0 4 * * ? *)'

You can either run it at a regular rate, or at a specific time.
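For a regular rate, the put-rule looks like this instead:

docker run --rm -it -v ~/.aws:/root/.aws -v $PWD:/data -w /data -e AWS_PROFILE amazon/aws-cli events put-rule --name foo --schedule-expression 'rate(5 minutes)'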

And your lambda needs the right permissions:

aws-cli lambda add-permission --function-name foo --statement-id foo --action 'lambda:InvokeFunction' --principal events.amazonaws.com --source-arn arn:aws:events:region:account:rule/foo

Finally, you need a targets file:

[{
    "Id": "1",
    "Arn": "arn:aws:lambda:region:account:function:foo"
}]

to add to the rule:

aws-cli events put-targets --rule foo --targets file://targets.json
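You can sanity-check the rule’s targets afterwards:

aws-cli events list-targets-by-rule --rule foo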

Cron lambda (Python)

For a simple task in Redshift, such as refreshing a materialized view, you can use a scheduled query; but sometimes you really want a proper scripting language, rather than SQL.

You can use a docker image as a lambda now, but I still find uploading a zip easier. And while it’s possible to set up the db creds as env vars, it’s better to use temp creds:

import boto3
import psycopg2

def handler(event, context):
    client = boto3.client('redshift')

    cluster_credentials = client.get_cluster_credentials(
        DbUser='user',
        DbName='db',
        ClusterIdentifier='cluster',
    )

    conn = psycopg2.connect(
        host="foo.bar.region.redshift.amazonaws.com",
        port="5439",
        dbname="db",
        user=cluster_credentials["DbUser"],
        password=cluster_credentials["DbPassword"],
    )

    with conn.cursor() as cursor:
        ...

Once you have the code ready, you can build the zip bundle:

pip install -r requirements.txt -t ./package
cd package && zip -r ../foo.zip . && cd ..
zip -g foo.zip app.py

You need a trust policy, to allow lambda to assume the role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "sts:AssumeRole",
            "Principal": {
                "Service": "lambda.amazonaws.com"
            },
            "Effect": "Allow",
            "Sid": ""
        }
    ]
}

And a policy for the redshift creds:

{
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "GetClusterCredsStatement",
        "Effect": "Allow",
        "Action": [
            "redshift:GetClusterCredentials"
        ],
        "Resource": [
            "arn:aws:redshift:region:account:dbuser:cluster/db",
            "arn:aws:redshift:region:account:dbname:cluster/db"
        ]
    }]
}

In order to create an IAM role:

docker run --rm -it -v ~/.aws:/root/.aws -v $PWD:/data -w /data -e AWS_PROFILE amazon/aws-cli iam create-role --role-name role --assume-role-policy-document file://trust-policy.json
docker run --rm -it -v ~/.aws:/root/.aws -v $PWD:/data -w /data -e AWS_PROFILE amazon/aws-cli iam attach-role-policy --role-name role --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
docker run --rm -it -v ~/.aws:/root/.aws -v $PWD:/data -w /data -e AWS_PROFILE amazon/aws-cli iam attach-role-policy --role-name role --policy-arn arn:aws:iam::aws:policy/service-role/AWSXRayDaemonWriteAccess
docker run --rm -it -v ~/.aws:/root/.aws -v $PWD:/data -w /data -e AWS_PROFILE amazon/aws-cli iam put-role-policy --role-name role --policy-name GetClusterCredentials --policy-document file://get-cluster-credentials.json

And, finally, the lambda itself:

docker run --rm -it -v ~/.aws:/root/.aws -v $PWD:/data -w /data -e AWS_PROFILE amazon/aws-cli lambda create-function --function-name foo --runtime python3.7 --zip-file fileb://foo.zip --handler app.handler --role arn:aws:iam::account:role/role --timeout 900

If you need to update the code afterwards:

docker run --rm -it -v ~/.aws:/root/.aws -v $PWD:/data -w /data -e AWS_PROFILE amazon/aws-cli lambda update-function-code --function-name foo --zip-file fileb://foo.zip

You can test the lambda in the console. Next time, we’ll look at how to trigger it, using EventBridge.
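(You can also invoke it from the CLI; this assumes the aws cli v2 image, hence the --cli-binary-format flag:)

docker run --rm -it -v ~/.aws:/root/.aws -v $PWD:/data -w /data -e AWS_PROFILE amazon/aws-cli lambda invoke --function-name foo --cli-binary-format raw-in-base64-out --payload '{}' response.json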