Error from postgresql_user

We have an ansible task that creates a postgres user (role) for vagrant:

- name: Create vagrant user
  sudo: true
  sudo_user: postgres
  postgresql_user: name=vagrant role_attr_flags=CREATEDB,CREATEUSER

which was working fine with ansible 1.9, but when we upgraded to 2.0 we started getting an error if the user already existed:

TASK [pg-vagrant : Create vagrant user] ****************************************
fatal: [default]: FAILED! => {"changed": false, "failed": true, "module_stderr": "", "module_stdout": "\r\nTraceback (most recent call last):\r\n  File \"/tmp/ansible-tmp-1457016468.17-9802839733620/postgresql_user\", line 2722, in \r\n    main()\r\n  File \"/tmp/ansible-tmp-1457016468.17-9802839733620/postgresql_user\", line 621, in main\r\n    changed = user_alter(cursor, module, user, password, role_attr_flags, encrypted, expires, no_password_changes)\r\n  File \"/tmp/ansible-tmp-1457016468.17-9802839733620/postgresql_user\", line 274, in user_alter\r\n    if current_role_attrs[PRIV_TO_AUTHID_COLUMN[role_attr_name]] != role_attr_value:\r\n  File \"/usr/lib/python2.7/dist-packages/psycopg2/extras.py\", line 144, in __getitem__\r\n    x = self._index[x]\r\nKeyError: 'rolcreateuser'\r\n", "msg": "MODULE FAILURE", "parsed": false}

When I checked the role that had been created successfully the first time I provisioned:

=# \dg
                                 List of roles
    Role name     |                   Attributes                   | Member of
------------------+------------------------------------------------+-----------
 postgres         | Superuser, Create role, Create DB, Replication | {}
 vagrant          | Superuser, Create DB                           | {}

I noticed that the role actually had Superuser privileges, something the documentation confirmed:

CREATEUSER
NOCREATEUSER
These clauses are an obsolete, but still accepted, spelling of SUPERUSER and NOSUPERUSER. Note that they are not equivalent to CREATEROLE as one might naively expect!

So it looks like somewhere in the toolchain (ansible, psycopg2, postgres) this is no longer supported. Substituting SUPERUSER for CREATEUSER fixed the issue.
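
For reference, the fixed task with the obsolete flag swapped out:

- name: Create vagrant user
  sudo: true
  sudo_user: postgres
  postgresql_user: name=vagrant role_attr_flags=CREATEDB,SUPERUSER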

Ansible templates and urlencode

If you use ansible templates to generate your config, and you use strong passwords, chances are you already know to use the urlencode filter:

{
    "postgres": "postgres://{{ db_user }}:{{ db_password|urlencode() }}@{{ db_server }}/{{ db_name }}",
}

to ensure that any special chars are not mangled. Unfortunately, as I discovered to my cost, the jinja filter does NOT encode a forward slash as %2F.
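
One workaround is to chain the standard replace filter after urlencode to catch the slash explicitly (a sketch, using the same variables as above):

{
    "postgres": "postgres://{{ db_user }}:{{ db_password|urlencode()|replace('/', '%2F') }}@{{ db_server }}/{{ db_name }}",
}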

apt update_cache with ansible

Installing 3rd party software, e.g. elasticsearch, sometimes involves adding an apt repo:

- name: Add apt repository
  apt_repository: repo='deb https://example.com/apt/example jessie main'

Once this has been added, it’s necessary to call apt-get update before the new software can be installed. It’s tempting to do so by adding update_cache=yes to the apt call:

- name: Install pkg
  apt: name=example update_cache=yes

But a better solution is to separate the two:

- name: Add apt repository
  apt_repository: repo='deb https://example.com/apt/example jessie main'
  register: apt_source_added

- name: Update cache
  apt: update_cache=yes
  when: apt_source_added|changed

- name: Install pkg
  apt: name=example

This ensures that the (time-consuming) update only happens the first time, when the new repo is added. It also makes it much clearer which step is taking the time if the build hangs.

EDIT: I completely forgot that it’s possible to add the update_cache attribute directly to the apt_repository call. Much simpler!

- name: Add apt repository
  apt_repository: repo='deb https://example.com/apt/example jessie main' update_cache=yes

- name: Install pkg
  apt: name=example
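
A related option, if you do keep update_cache on the apt call, is the module's cache_valid_time parameter, which skips the update when the cache is newer than the given number of seconds:

- name: Install pkg
  apt: name=example update_cache=yes cache_valid_time=3600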

Ansible & systemctl daemon-reload

(UPDATE: there is now a PR open to add this functionality)

We recently migrated to Debian 8 which, by default, uses systemd. I can appreciate why some people have misgivings about it, but from my point of view it’s been a massive improvement.

Our unit files look like this now:

[Service]
ExecStart=/var/www/{{ app_name }}/app.js
Restart=always
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier={{ app_name }}
User={{ app_name }}
Group={{ app_name }}
Environment=NODE_ENV={{ env }}
WorkingDirectory=/var/www/{{ app_name }}

[Install]
WantedBy=multi-user.target

Compare that to a three-page init script using start-stop-daemon. And we no longer need a watchdog like monit.

We do our deployments using ansible, which already knows how to play nice with systemd. One thing that is missing, though: if you change a unit file, you need to call systemctl daemon-reload before the changes will be picked up.

There’s a discussion underway as to whether ansible should take care of it. But for now, the easiest thing to do is add another handler:

- name: Install unit file
  sudo: true
  copy: src=foo.service dest=/lib/systemd/system/ owner=root mode=644
  notify:
    - reload systemd
    - restart foo

with a handler like this:

- name: reload systemd
  sudo: yes
  command: systemctl daemon-reload
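
The restart handler itself isn't shown here, but it would be something along these lines (a sketch using the stock service module; the service name is just an example):

- name: restart foo
  sudo: yes
  service: name=foo state=restarted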

UPDATE: if you need to restart the service later in the same play, you can flush the handlers to ensure daemon-reload has been called:

- meta: flush_handlers
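
Placement matters: flush_handlers runs any notified handlers at that point in the play, so it goes between the task that changed the unit file and whatever later task restarts or depends on the service. A sketch, reusing the tasks above:

- name: Install unit file
  sudo: true
  copy: src=foo.service dest=/lib/systemd/system/ owner=root mode=644
  notify: reload systemd

# force the 'reload systemd' handler (and any others) to run right now
- meta: flush_handlers

- name: Restart foo
  sudo: true
  service: name=foo state=restarted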

Debugging an ansible module

Debugging an ansible module can be a pretty thankless task; luckily the team has provided some tools to make it a little easier. While it’s possible to attach a debugger (e.g. epdb), good old-fashioned print debugging is normally enough.

If you just add a print statement to the module and run it normally, you’ll be disappointed to find that your output is nowhere to be seen. This is due to the way that ansible modules communicate, using stdin & stdout.
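
To see why, remember that a module's entire reply to ansible is a single JSON document written to stdout (via exit_json or fail_json), so anything extra you print ends up mixed into that reply. A minimal sketch, assuming the Ansible 2 style AnsibleModule boilerplate:

from ansible.module_utils.basic import AnsibleModule

def main():
    module = AnsibleModule(argument_spec=dict(name=dict(required=True)))
    # print("debug: about to do the thing")  # a stray print like this corrupts the JSON reply
    module.exit_json(changed=False, msg="hello %s" % module.params['name'])

if __name__ == '__main__':
    main()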

The secret is to run your module using the test-module script provided by the ansible team. You can then pass in the arguments to your module as a json blob:

hacking/test-module -m library/cloud/rax -a "{ \"credentials\": \"~/.rackspace_cloud_credentials\", \"region\": \"LON\", \"name\": \"app-prod-LON-%02d\", \"count\": \"2\", \"exact_count\": \"yes\", \"group\": \"app_servers\", \"flavor\": \"performance1-1\", \"image\": \"11b0cefc-d4ec-4f09-9ff6-f842ca97987c\", \"state\": \"present\", \"wait\": \"yes\", \"wait_timeout\": \"900\" }"

The output of this script can be a bit verbose, so if you’re only interested in your own output it can be worth commenting out the code that prints the parsed results and keeping just the raw version.

Adding instances to multiple host groups using the Ansible rax module

We use the Ansible rax module to create new instances of our “cloud servers”. It’s pretty easy to add them to one group:

- name: Build a Cloud Server
  hosts: localhost
  tasks:
    - local_action:
        module: rax
        name: rax-test1
        wait: yes
        state: present
        networks:
          - private
          - public
        group: app-servers

But it’s also quite handy to be able to place a server in multiple groups (e.g. test/production, different regions, etc.). There’s nothing in the documentation about this, but a bit of code spelunking reveals that a metadata key named “groups” can contain a comma-separated list of extra host groups:

- name: Build a Cloud Server
  hosts: localhost
  tasks:
    - local_action:
        module: rax
        name: rax-test1
        wait: yes
        state: present
        networks:
          - private
          - public
        group: app-servers
        meta:
          groups: test, london