“Trouble parsing json”

We use Bunyan in our node apps for “structured logging”. The output json string is passed to syslog by systemd, and then fed into ELK.

{
    "name":"foo-service",
    "hostname":"app-01",
    "pid":30988,
    "ip":"1.19.24.8",
    "requestId":"1c11f448-73f2-4efa-bc63-3de787618d49",
    "level":50,
    "err": {
        "message":"oh noes!"
    }
}

Unfortunately, if that string is longer than 2048 characters (usually because of a stack trace, or html returned from a web service instead of json), the json blob ends up split across two lines in syslog.

This causes ELK to barf when attempting to parse the broken lines (assuming you are parsing as json), and means you won’t see those errors in Kibana.

It is possible to detect these parse failures by searching for the string “Trouble parsing json”, but that’s not really a solution.
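If nothing else, that search can be automated. A sketch (my own, not tested against a real pipeline) of a logstash conditional filter that tags any event containing the tell-tale string, so the failures at least show up on a dashboard:

```
filter {
    if "Trouble parsing json" in [message] {
        mutate { add_tag => [ "json_parse_failure" ] }
    }
}
```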

I would rather see a truncated error than the current situation, but that means either wrapping or patching Bunyan itself.
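The wrapping approach might look something like this: a custom `err` serializer that truncates long fields before Bunyan serialises the record. This is a sketch, not something from Bunyan itself; `MAX_FIELD` is a guess at a budget that keeps the whole line under 2048 characters.

```javascript
// Hypothetical serializer: truncates long error fields so the serialised
// record stays under syslog's line limit. MAX_FIELD is an assumed budget.
var MAX_FIELD = 1024;

function truncate(s) {
    return (s && s.length > MAX_FIELD) ? s.slice(0, MAX_FIELD) + '...' : s;
}

function errSerializer(err) {
    return {
        message: truncate(err.message),
        stack: truncate(err.stack)
    };
}

// usage:
// var logger = bunyan.createLogger({
//     name: 'foo-service',
//     serializers: { err: errSerializer }
// });
```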

Using bunyan with express

I’ve pontificated previously about the benefits of combining bunyan with logstash, but if your app is using express it can be a little more complicated.

I wanted to add the user id & IP address of a request to all log output involved in that request. Bunyan makes it simple to create a child logger and add the necessary properties; but due to the way express is architected (and how node works), the only way to access the request data is to pass the child logger to the handler, and to all of its dependencies:

app.post("/foo", function(req, res) {
    var childLogger = logger.child({
            userId: req.userId, // taken from a header by middleware
            ip: req.ip
        }),
        barService = new BarService(childLogger),
        handler = new Handler(barService, childLogger);
    handler.handle(req, res);
});

This makes it possible to search the logs for all information relating to a specific user or IP address.
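The child-logger wiring can also be pulled out into a middleware that attaches the logger to the request object, so each route doesn’t repeat it. A hypothetical sketch (`req.log` is my own naming, not an express or bunyan convention):

```javascript
// Hypothetical middleware: builds the per-request child logger once and
// hangs it off the request, so handlers can read req.log instead of
// taking a logger as a constructor argument.
function requestLogger(logger) {
    return function (req, res, next) {
        req.log = logger.child({
            userId: req.userId, // taken from a header by earlier middleware
            ip: req.ip
        });
        next();
    };
}

// usage: app.use(requestLogger(logger));
```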

Multiple configuration files for LogStash

If you have multiple applications running on a box with the logstash client, you probably want to split the config file up a bit. According to this thread, you can pass a directory as the value of the --config flag:

/opt/logstash/bin/logstash agent --config /etc/logstash.d --log /var/log/logstash/logstash.log

and the files in that directory will be concatenated in lexicographical order.

This means you need some sort of scheme to keep your inputs before your outputs, etc. I’m not too bothered about the order of the individual inputs, so I prefixed each category with a different number:

$ ls -la /etc/logstash.d
total 32
drwxr-xr-x  2 logstash logstash 4096 Apr 14 13:42 .
drwxr-xr-x 98 root     root     4096 Apr 16 01:28 ..
-rwxr-xr-x  1 logstash logstash  122 Jan  8 12:55 1_bar_inputs.conf
-rwxr-xr-x  1 logstash logstash  140 Jan  8 13:34 1_foo_inputs.conf
-rwxr-xr-x  1 logstash logstash   99 Jan  8 12:06 1_syslog_inputs.conf
-rwxr-xr-x  1 logstash logstash  229 Jan  8 12:55 2_bar_filters.conf
-rwxr-xr-x  1 logstash logstash  237 Jan  8 13:34 2_foo_filters.conf
-rwxr-xr-x  1 logstash logstash  170 Apr 14 13:42 3_output.conf

ensuring that inputs will be before filters, and both will be before outputs.

This means that the logstash config for each app can be deployed with it, without needing to amend a shared file.

LogStash & Bunyan, a match made in heaven

Every platform has a plethora of logging libraries, each with its own good and bad points. The first node.js library we tried was winston. It had all the usual features: multiple logging levels and different outputs (we actually just use stdout, and the daemon pipes it to a logfile); but in production we wanted to use logstash to aggregate and parse the logs.

Unfortunately winston suffers from a common problem: exception stack traces are split across multiple lines. It should be possible to use the multiline codec to merge the lines, but around this time I ran across bunyan.
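For reference, the multiline merge might look something like this (a sketch; it assumes stack-trace continuation lines start with whitespace, which holds for node’s default stack format):

```
input {
    file {
        path => [ "/var/log/foo-app/*.log" ]
        codec => multiline {
            pattern => "^\s"
            what => "previous"
        }
    }
}
```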

The manifesto from Joyent explains the merits of logging in json (or some other machine-parseable format), and it’s very easy to configure a logstash file input for a json file just by setting the codec:

input {
    file {
        type => "foo-app"
        path => [ "/var/log/foo-app/*.log" ]
        codec => "json"
    }   
}

Now, when your logs are shipped to the logstash server, each field will be indexed individually without needing to grok any crazy patterns.

It’s also very simple to add fields to the logger at runtime:

var logger = bunyan.createLogger({
    name: name,
    stream: process.stdout,
    level: 'info',
    anExtraField: 'foo'
});

...

logger.fields.anotherExtraField = 'bar';

which will also be searchable in logstash, making it easy to trace a request across multiple services, for example.

UPDATED: to add some sample output

{"name":"foo","hostname":"pikachu","pid":7645,"level":30,"msg":"Listening on port 1234","time":"2014-08-12T16:14:19.667Z","v":0}
{"name":"foo","hostname":"pikachu","pid":7645,"level":30,"msg":"Service started","time":"2014-08-12T16:14:19.673Z","v":0}
{"name":"foo","hostname":"pikachu","pid":7645,"level":30,"msg":"Received request for abc","time":"2014-08-12T16:32:28.600Z","v":0}
{"name":"foo","hostname":"pikachu","pid":7645,"level":50,"msg":"ERROR: bad request","time":"2014-08-12T16:32:28.600Z","v":0}

As the levels are integers (e.g. 30 => INFO), you can ask logstash for all entries at or above a threshold (e.g. level >= 50 for all ERRORs & FATALs).
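Bunyan’s standard level numbers, and the kind of threshold filter such a query expresses, sketched here in plain JavaScript over parsed log entries:

```javascript
// Bunyan's standard numeric levels.
var LEVELS = { trace: 10, debug: 20, info: 30, warn: 40, error: 50, fatal: 60 };

// Given parsed log entries, keep everything at ERROR level or above.
function errorsAndWorse(entries) {
    return entries.filter(function (e) { return e.level >= LEVELS.error; });
}
```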

Logging unhandled exceptions with Nancy

Normally when developing an ASP.NET application I add fallback logging for any unhandled exceptions in Global.asax.cs:

public class Global : HttpApplication
{
    private ILog _logger;
    
    protected void Application_Start(object sender, EventArgs e)
    {
        XmlConfigurator.Configure();
        _logger = LogManager.GetLogger(typeof(Global));
    }

    protected void Application_Error(object sender, EventArgs e)
    {
        var error = Server.GetLastError();
        _logger.Error("Unhandled error", error);
    }
}

This means any YSODs are logged.

When developing applications with Nancy this no longer works, as Nancy has its own error-handling pipeline.

The solution is to implement a handler to log any unhandled errors:

public class LoggingErrorHandler : IErrorHandler
{
    private readonly ILog _logger = LogManager.GetLogger(typeof(LoggingErrorHandler));

    public bool HandlesStatusCode(HttpStatusCode statusCode)
    {
        return statusCode == HttpStatusCode.InternalServerError;
    }

    public void Handle(HttpStatusCode statusCode, NancyContext context)
    {
        object errorObject;
        context.Items.TryGetValue(NancyEngine.ERROR_EXCEPTION, out errorObject);
        var error = errorObject as Exception;

        _logger.Error("Unhandled error", error);
    }
}

And, as of 0.10.0, there’s no longer any need to register it. Any IErrorHandler implementations will be picked up, auto-magically.