Terraforming an RDS Proxy

We have been using pgbouncer as a connection pooler, and have been very happy with it; but some recent load testing has shown that it will need to grow with throughput. And it is also a single point of failure, unless you build some sort of load balanced cluster.

While that is possible, RDS Proxy offers the dream of autoscaling & resilience delivered to your door. We looked into it, when first released, and you had to use IAM authentication; but now native auth is an option.

I was following the getting started guide, but I wanted to use tf rather than the cli.

First, I needed a security group, allowing access to the pg port:

resource "aws_security_group" "rds_proxy_sg" {
    name = "rds-proxy-sg"
    vpc_id = var.vpc_id

    ingress {
        cidr_blocks = [ 
            var.aws_cidr
        ]   
        from_port = 5432
        to_port = 5432
        protocol = "tcp"
    }
    egress {
        from_port = 0 
        to_port  = 0 
        protocol = "-1"
        cidr_blocks = [ "0.0.0.0/0" ]
    }
}

And a role/policy to allow the proxy to get creds from secrets manager:

resource "aws_iam_role" "rds_proxy_role" {
  name = "RdsProxySecrets"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "rds.amazonaws.com"
      },
      "Effect": "Allow"
    }
  ]
}
EOF
}

resource "aws_iam_policy" "rds_proxy_policy" {
  name        = "RdsProxySecrets"
  path        = "/"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "secretsmanager:GetSecretValue",
            "Resource": [
                "arn:aws:secretsmanager:eu-west-2:***:secret:*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "kms:Decrypt",
            "Resource": "arn:aws:kms:eu-west-2:****:key/***",
            "Condition": {
                "StringEquals": {
                    "kms:ViaService": "secretsmanager.eu-west-2.amazonaws.com"
                }
            }
        }
    ]
}
EOF
}

resource "aws_iam_policy_attachment" "rds_proxy_policy_attachment" {
  name       = "RdsProxySecrets"
  roles      = [aws_iam_role.rds_proxy_role.name]
  policy_arn = aws_iam_policy.rds_proxy_policy.arn
}

I don’t have any other secrets in there, so I used a wildcard instead of listing them all individually.

Then the proxy itself:

resource "aws_db_proxy" "rds_proxy" {
    name                   = "rds-proxy"
    debug_logging          = false
    engine_family          = "POSTGRESQL"
    idle_client_timeout    = 1800
    require_tls            = true 
    role_arn               = aws_iam_role.rds_proxy_role.arn
    vpc_security_group_ids = [aws_security_group.rds_proxy_sg.id]
    vpc_subnet_ids         = var.private_subnets
    ...

which needs the role arn, the sg id, and the subnets (output from a dependency, in my case).

And you also need the auth block, for each login:

    auth {
        username = "foo"
        secret_arn = "arn:aws:secretsmanager:eu-west-2:***:secret:foo"
        iam_auth   = "DISABLED"
        client_password_auth_type = "POSTGRES_SCRAM_SHA_256"
    }

Unfortunately, when I ran this, I got an error:

An error occurred (InvalidParameterValue) when calling the CreateDBProxy operation: UserName must not be set in UserAuthConfig when SECRETS AuthScheme is used.

Given that “SECRETS” is the only auth scheme available, this was a bit confusing; but the username is included in the secret json (more on this later), so I guess it is not needed. With that property removed, the proxy could be created.

You also need a target, i.e. which db to call:

resource "aws_db_proxy_target" "rds_proxy_target" {
    db_instance_identifier = var.rds_instance
    db_proxy_name          = aws_db_proxy.rds_proxy.name
    target_group_name      = "default"
}

I am relying on the “default” target group being named that, rather than creating a new one.

At this point, I thought I was golden, but when I tried to connect:

$ psql -h rds-proxy.proxy-***.eu-west-2.rds.amazonaws.com -d foo -U foo
psql: error: FATAL:  This RDS proxy has no credentials for the role foo. Check the credentials for this role and try again.

I was disappoint.

I turned on “enhanced logging”, and had a look in cloudwatch:

Credentials couldn't be retrieved. The AWS Secrets Manager secret with the ARN "arn:aws:secretsmanager:eu-west-2:***:secret:foo" has an incorrect secret format or has empty username. Format the secret like this: 
{
    "username": "",
    "password": ""
}

I could see the creds in Secrets Manager, and it looked like the correct format:

{'username': 'foo', 'password': '...'}

but clearly something was wrong:

I had created the secrets using ansible, because the db passwords were in group vars (this creates a chicken & egg problem with the tf, but I’ll ignore that for now):

- name: Create creds in secrets manager
  local_action:
    module: community.aws.secretsmanager_secret
    name: "creds-{{ item }}"
    state: present
    secret_type: 'string'
    secret: >
      {"username":"{{ item }}","password":"{{ hostvars[groups['pgbouncer'][0]][item | regex_replace('-', '_') + '_db_password'] }}"}
  loop: "{{ db_accounts }}"

but it seems that I needed to use json_secret instead.

$ psql -h rds-proxy.proxy-***.eu-west-2.rds.amazonaws.com -d foo -U foo
Password for user foo: 
psql (13.11 (Debian 13.11-0+deb11u1), server 13.10)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

foo=> 

Success! 🌈

Now I need to throw some locusts at it, and see if this was worth the effort.

Leave a comment