Managing nodejs dependencies

The easiest way to “deploy” a node app is to clone the git repo on a server and run npm install. There are a couple of disadvantages, though. First, I don’t really like having to install git on the server, and manage credentials for a private repo.

Second, installing the dependencies like that means you may get different versions of the modules you rely on than you were expecting. One of the tenets of a reliable build pipeline is ensuring that builds are repeatable, and that what you deploy matches what you tested.

There are a few alternatives: you could vendor in the node_modules folder, but this dirties the commit history, and increases the size of your repo. You could use npm shrinkwrap, which is the same concept as Bundler’s Gemfile.lock, a list of specific versions to install. This is definitely an improvement, but still leaves the risk of npm i failing during a deployment.

I’d prefer to only install the dependencies once, on the build server. This means I can run the tests, then tar up the folder and upload that to each environment in turn:

npm install
npm test
npm prune --production
npm dedupe
tar -czf app.tar.gz --exclude='*/coverage' app/

After running the tests, we prune the dependencies to remove mocha etc. We then dedupe to try to reduce the number of copies of shared modules, and finally create an archive of the app. This is output as an artifact of the build, pulled in by the deploy-to-staging build, output again from that, and finally pulled in by the deploy-to-production build.

Arrow functions returning an expression

One of the new features of ES6 that has made it into Node 4 is arrow functions. According to the documentation, there are two alternative syntaxes:

(param1, param2, …, paramN) => { statements }
(param1, param2, …, paramN) => expression
         // equivalent to:  => { return expression; }

Using the second form works as expected for primitive values:

> [1,2,3].map(i => i);
[ 1, 2, 3 ]

but not when you return an object:

> [1,2,3].map(i => { id: i });
[ undefined, undefined, undefined ]

As was pointed out to me on SO, the parser can’t tell the difference between a statement block and an expression consisting of an object literal: the braces are parsed as a block, id: i as a label, and the body therefore returns nothing.

The solution is to wrap the returned object in an extra set of parentheses:

> [1,2,3].map(i => ({ id: i }));
[ { id: 1 }, { id: 2 }, { id: 3 } ]

Zero downtime deployments with node cluster

The easiest way to do zero downtime deployments is using multiple nodes behind a load balancer. Once a node is removed from the rotation, you can fiddle with it to your heart’s content.

If you only have one box to play with, things aren’t so simple. One option is to push the complexity up to whatever you’re using to orchestrate deployments. You could do something like a blue/green deployment, with two full sets of processes behind nginx as a load balancer; but this felt liable to be very fragile.

I next started looking at a process manager, like pm2; but it seemed to offer far too many features I didn’t need, and didn’t play that well with systemd. It was the inspiration, though, for going direct to the nodejs cluster API, which has been available for some time and is now marked as “stable”.

It allows you to run multiple node processes that share a port, which is also useful if you are running on hardware with multiple CPUs. Using the cluster API allows us to recycle the processes when the code has changed, without dropping any requests:

var cluster = require('cluster');

module.exports = function(start) {
    var env = process.env.NODE_ENV || 'local';
    var cwd = process.cwd();

    var numWorkers = process.env.NUM_WORKERS || 1;

    if (cluster.isMaster) {
        fork();

        cluster.on('exit', function(worker, code, signal) {
            // one for all, let systemd deal with it
            if (code !== 0) {
                process.exit(code);
            }
        });

        process.on('SIGHUP', function() {
            // need to chdir, if the old directory was deleted
            process.chdir(cwd);
            var oldWorkers = Object.keys(cluster.workers);
            // wait until at least one new worker is listening
            // before terminating the old workers
            cluster.once('listening', function(worker, address) {
                kill(oldWorkers);
            });
            fork();
        });
    } else {
        start(env);
    }

    function fork() {
        for (var i = 0; i < numWorkers; i++) {
            cluster.fork();
        }
    }

    function kill(workers) {
        if (workers.length) {
            var id = workers.pop();
            var worker = cluster.workers[id];
            worker.send('shutdown');
            worker.disconnect();
            kill(workers);
        }
    }
};

During a deployment, we simply replace the old code and send a signal to the “master” process (kill -HUP $PID), which causes it to spin up a new set of workers. As soon as one of those is ready and listening, it terminates the old ones.
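For this to drop no requests, the workers have to cooperate with that 'shutdown' message. A minimal sketch of the worker-side handler (the function name and the injectable exit parameter are mine, for illustration and testability):

```javascript
// Worker-side counterpart to the master above: the master sends the
// string 'shutdown' before calling worker.disconnect(), so the worker
// can stop accepting new connections, let in-flight requests drain,
// and then exit with code 0 (which the master's 'exit' handler ignores).
function installShutdownHandler(server, exit) {
    exit = exit || process.exit;

    function onMessage(msg) {
        if (msg === 'shutdown') {
            server.close(function() {
                exit(0);
            });
        }
    }

    process.on('message', onMessage);
    return onMessage;
}
```

Inside start(env), you’d call installShutdownHandler(server) right after server.listen(...).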

Converting complex js objects to xml

The xml node package offers “fast and simple Javascript-based XML generation”.

I had hoped that it would be as simple as:

var xml = require('xml');

var result = xml({
    FOO: {
        BAR: "something"
    }
});

But that only included the top level element:

<FOO />

After some RTFM, and a few false starts, it became clear that you need to represent each child node as an array of key value pairs:

xml({
    FOO: [
        { _attr: { abra: "cadabra" } },
        { BAR: "something" },
        { BAZ: "else" }
    ]
});

Which should result in XML like this:

<FOO abra="cadabra">
    <BAR>something</BAR>
    <BAZ>else</BAZ>
</FOO>

Using bunyan with express

I’ve pontificated previously about the benefits of combining bunyan with logstash, but if your app is using express it can be a little more complicated.

I wanted to add the user id & IP address of a request to all log output involved in that request. Bunyan makes it simple to create a child logger, and add the necessary properties; but due to the way express is architected (and how node works), the only way to access it is to pass it to the handler, and all its dependencies:

app.post("/foo", function(req, res) {
    var childLogger = logger.child({
            userId: req.userId, // taken from a header by middleware
            ip: req.ip
        }),
        barService = new BarService(childLogger),
        handler = new Handler(barService, childLogger);
    handler.handle(req, res);
});

Which makes it possible to search the logs for all information relating to a specific user, or IP address.
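For completeness, the middleware hinted at in that comment might look something like this; the x-user-id header name is an assumption for illustration, not something express or bunyan mandates:

```javascript
// Hypothetical middleware: lift the user id out of a request header
// onto req, so handlers can hand it to the child logger.
function userIdMiddleware() {
    return function(req, res, next) {
        req.userId = req.headers['x-user-id'];
        next();
    };
}
```

You’d register it early, with app.use(userIdMiddleware()), so req.userId is set before any handler builds its child logger.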

Using promises with node-pg

While it is possible to use node-pg with just raw callbacks, as soon as you want to use transactions the chances of getting it right dwindle rapidly.

Using a promise library helps remove some of the pain. I ended up with something like this (using Q):

var Q = require('q'),
    pg = require('pg'),
    Connection = require('./Connection');

module.exports = function(connString) {
    this.connect = function() {
        var deferred = Q.defer();

        pg.connect(connString, function(err, client, done) {
            if (err) {
                return deferred.reject(new Error(err));
            }

            deferred.resolve(new Connection(client, done));
        });

        return deferred.promise;
    };
};

And Connection.js wraps the client, and the done callback that returns it to the pool:
var Q = require('q'),
    pg = require('pg');

module.exports = function(client, done) {
    this.begin = function() {
        return this.query('BEGIN;', []);
    };

    this.query = function(sql, args) {
        var deferred = Q.defer();

        client.query(sql, args, function(err, result) {
            if (err) {
                err.message = err.message + ' (Query: ' + sql + ', Args: ' + args + ')';
                return deferred.reject(err);
            }

            deferred.resolve(result.rows);
        });

        return deferred.promise;
    };

    this.rollback = function() {
        var deferred = Q.defer();

        client.query('ROLLBACK', function(err) {
            deferred.resolve();
            return done(err);
        });

        return deferred.promise;
    };

    this.commit = function() {
        var deferred = Q.defer();

        client.query('COMMIT', function(err) {
            if (err) {
                deferred.reject(err);
            } else {
                deferred.resolve();
            }
            return done(err);
        });

        return deferred.promise;
    };

    this.close = function() {
        done();
    };
};

Which you can use like this:

var db = new Db(connString);

return db.connect().then(function(conn) {
    return conn.begin().then(function() {
        return conn.query('SELECT something FROM foo WHERE id = $1;', [id]).then(function(rows) {
            return conn.query('INSERT INTO ...', [...]);
        }).then(function() {
            return conn.commit();
        });
    }).fin(conn.close);
});

This removes at least some of the risk of failing to return the connection to the pool, or otherwise bungling your error handling.
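One remaining gap in the example above is that nothing rolls back when a query fails. A sketch of one way to centralise that (still using Q’s .fail/.fin; the transact name is mine):

```javascript
// Sketch: run `work` inside a transaction, committing on success,
// rolling back on failure, and always returning the client to the
// pool. Re-throws the original error so callers still see it.
function transact(db, work) {
    return db.connect().then(function(conn) {
        return conn.begin().then(function() {
            return work(conn);
        }).then(function(result) {
            return conn.commit().then(function() {
                return result;
            });
        }).fail(function(err) {
            return conn.rollback().then(function() {
                throw err;
            });
        }).fin(conn.close);
    });
}
```

You’d then write transact(db, function(conn) { return conn.query(...); }) and keep the begin/commit/rollback plumbing in one place.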

Using Bootstrap alerts with Express.js 4

It’s often handy to be able to display a notification to users, noting a successful operation for example, or showing errors.

With Express this is normally done using a “flash message”. You’ll need the connect-flash middleware, as it was unbundled in Express 3.

var express = require('express'),
    flash = require('connect-flash'),
    app = express();

app.use(flash());
app.use(function(req, res, next){
    res.locals.success = req.flash('success');
    res.locals.error = req.flash('error');
    next();
});

The second piece of middleware ensures that flash messages will be available to the template as locals.

router.get('/account/name', function (req, res) {
    var data = {
        firstName: req.user.firstName,
        lastName: req.user.lastName
    };
    res.render('settings_name', data);
});

router.post('/account/name', function (req, res) {
    req.user.updateName(req.user.id, req.body.firstName, req.body.lastName, function(err) {
        if (err) {
            req.flash('error', 'Could not update your name, please contact our support team');
        } else {
            req.flash('success', 'Your name was updated');
        }
        res.redirect('/account/name');
    });
});

The final step is to display the messages when necessary. Bootstrap alerts provide an easy way to do this:

doctype html
html
  head
    link(rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap.min.css")
  body
    if error && error.length > 0
      .alert.alert-danger.alert-dismissible.fade.in(role="alert")
        button.close(type="button" data-dismiss="alert" aria-label="Close")
          span(aria-hidden="true") ×
        p= error
    if success && success.length > 0
      .alert.alert-success.alert-dismissible.fade.in(role="alert")
        button.close(type="button" data-dismiss="alert" aria-label="Close")
          span(aria-hidden="true") ×
        p= success

    script(src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js")
    script(src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/js/bootstrap.min.js")

The promise handler

Continuing from my last adventures with promises and express.js, it turns out I can have what I want. Sort of. Behold the promise handler:

module.exports = function(inner) {
    this.handle = function(req, res) {
        res.promise(inner.handle(req));
    };
};

So now my express handlers look like this:

module.exports = function() {
    this.handle = function(req) {
        return somethingThatReturnsAPromise();
    };
};

And you would wire it all together like this:

var app = express();
app.use(promiseMiddleware());
app.get("/foo", function(req, res) {
    var handler = new PromiseHandler(new FooHandler());
    return handler.handle(req, res);
});

(As a side-note, I’m a huge fan of the Russian doll model of composing behaviour of handlers)

Using promises with Express.js

I was looking for a way to combine promises with Express. I found a few suggestions, like this middleware, but nothing that really fit what I wanted.

Ideally, I’d like to be able to just return a promise from a handler:

app.get("/foo", function(req, res) {
    return getSomethingAsync();
});

but I couldn’t see any way to achieve that without hacking on express itself. The best I could come up with was some middleware to add a method to the response:

module.exports = function() {
    return function(req, res, next) {
        res.promise = function(promise) {
            promise.then(function(result) {
                res.send(200, result);
            }).fail(function(err) {
                if (err.statusCode) {
                    res.send(err.statusCode, { error: err.message });
                } else {
                    res.send(500, { error: 'Unexpected error' });
                }
            }).done();
        };
        next();
    };
};

which can be used like this:

app.get("/bar", function(req, res) {
    res.promise(getSomethingAsync());
});

Not the promised land

Promises are a much-touted solution to callback hell in nodejs. They can certainly help to clean up your code, taking you from this:

doSomething(function(err, res) {
    if (err) {
        return handleError(err);
    }

    doSomethingElse(res, function(err, res2) {
        if (err) {
            return handleError(err);
        }

        andThen();
    });
});

to this:

doSomething().then(function(res) {
    return doSomethingElse(res);
}).then(function() {
    andThen();
}).fail(handleError);

A definite improvement! However, there is a subtle difference between using callbacks and using promises: an uncaught exception in callback code will blow up the process, but can be silently swallowed by promises (specifically, if you don’t provide a fail handler, or call done()).

I want my software to fail fast (“let it crash”), rather than limping onwards. Unfortunately, this means that you need to be very careful when using promises, to ensure that you have covered all the possible cases. And it means you need a good understanding of how they work before starting to use them; not exactly the pit of success.

So, do the readability benefits outweigh the consequences? I think so, for now, but there’s definitely room for improvement. Maybe generators will be that silver bullet :)