Managing microservice secrets effectively on AWS Fargate
Amazon Elastic Container Service (ECS) makes it easy to deploy Docker-based applications as a set of services accessible over a public network. Fargate is a launch type built on top of ECS that manages the underlying compute for you, so you do not have to explicitly create and maintain clusters of EC2 nodes. As with any such service, security is a key concern. Serving secure connections over TLS, authenticating users, and identifying the service securely to other services are all essential, which means the application must be able to access various kinds of secrets such as private keys, certificates, passwords, tokens, and access keys. A naive approach is to bake all of these secrets into the container image itself. This is both insecure and inconvenient, for the following reasons:
- For each environment, you have to build a different container image because the secrets it contains must be specific to that environment. Not being able to use a common image is very inconvenient.
- Anybody who has access to the image can easily read the secrets embedded in the container's file system. For something like a public key this may not matter, but it is far too insecure for a true secret such as a database password, a public cloud access key, or a private key.
In this article we look at an effective strategy for addressing these concerns while building and deploying such container-based services. We will be using the AWS CLI, so make sure you have a recent version installed and configured (see the Appendix at the end of the article for details).
Deploying a microservice on Fargate
In this section, we deploy a simple service that does not use secrets in any way. In the next section, we will change our deployment specs to use secrets. To create a service on ECS from a container image, you must perform the following steps:
- Create an ECS cluster.
- Create Task Definitions to be used for deploying your service.
- Create and start the Task using the Task Definition.
Creating an ECS cluster
Use the following command to create a new ECS cluster if you do not already have one available. Make sure CLUS_NAME is set to the name of the cluster you want to create, and AWS_REGION is set to the correct region identifier (e.g. us-west-2 for US West, Oregon).
aws ecs create-cluster --cluster-name ${CLUS_NAME} --region=${AWS_REGION}
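As a quick sanity check, you can describe the cluster you just created. The values below are placeholders for illustration (fargate-secrets-demo is just an example name); substitute your own cluster name and region:
export CLUS_NAME=fargate-secrets-demo
export AWS_REGION=us-west-2
aws ecs describe-clusters --clusters ${CLUS_NAME} --region ${AWS_REGION}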
Creating a Task Definition
The best way to create a reusable task definition is to create a task definition JSON like the one shown below:
{
  "family": "nginxdemos-hello",
  "networkMode": "awsvpc",
  "containerDefinitions": [
    {
      "name": "nginxdemos-hello",
      "image": "nginxdemos/hello:latest",
      "portMappings": [
        {
          "containerPort": 80,
          "hostPort": 80,
          "protocol": "tcp"
        }
      ],
      "essential": true
    }
  ],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "cpu": "256",
  "memory": "512"
}
The above JSON defines a task template for ECS, identified by its task family nginxdemos-hello. It deploys a single container called nginxdemos-hello, using the nginxdemos/hello:latest image. The container listens for incoming HTTP requests on port 80; with the awsvpc network mode that Fargate requires, the host port must match the container port, so the service is reachable on port 80 of the task's network interface. We request 256 CPU units (0.25 vCPU, the smallest Fargate size) and 512 MiB of memory, the minimum allowed for that CPU size. To create the task definition using this JSON, save it in a file (say taskdef.json), then run the following command from the directory where the taskdef.json file is present:
aws ecs register-task-definition --cli-input-json file://./taskdef.json --region ${AWS_REGION}
Note that the path to the JSON task spec is a file URI. If you run the command from a different directory, pass the file URI corresponding to the absolute path of the taskdef.json file. This creates a task definition that you can then use to deploy a running task.
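To confirm that the registration succeeded and to find the latest revision number (which we will need shortly), you can describe the task definition by its family name; the --query expression here is just one convenient way to pull out the revision:
aws ecs describe-task-definition --task-definition nginxdemos-hello \
  --query "taskDefinition.revision" --region ${AWS_REGION}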
Deploying a Task using a task definition
To run a task from the task definition registered above, run the following command after substituting appropriate values for the referenced variables:
aws ecs run-task --cluster ${CLUS_NAME} --task-definition \
${TASK_FAMILY}:${TASK_REV} --count 1 --launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[${SUBNET}],assignPublicIp=ENABLED}" \
--region=${AWS_REGION}
- In the above, TASK_FAMILY should be the value of the family attribute in your task definition. In our example, that is nginxdemos-hello.
- TASK_REV is the revision number of the task definition. The first time a task definition with a particular family is registered, its revision number is 1. Each time you register a new task definition for the same family, the revision is bumped by 1. You can check the available task definitions, including their revision numbers, with:
aws ecs list-task-definitions --family-prefix nginxdemos-hello --region=${AWS_REGION}
- SUBNET is a single subnet identifier or a comma-separated list of subnet IDs, each identifying a subnet that is part of your VPC. For example, it could look like subnet-57fffd31 or subnet-57fffd31,subnet-71d2af7e. You can get a quick listing of the subnets accessible to you using:
aws ec2 describe-subnets --region=${AWS_REGION}
- Once the task is launched, you can check its status using the following command:
aws ecs list-tasks --region ${AWS_REGION}
You can note the task uuid from the output, set TASK_UUID to it, and run the following command to get more details:
aws ecs describe-tasks --tasks \
"arn:aws:ecs:${AWS_REGION}:${AWS_ACCOUNT_ID}:task/${TASK_UUID}" \
--cluster ${CLUS_NAME} --region ${AWS_REGION}
Currently, the public IP of the task is not shown directly in this output; you can look it up in the AWS ECS console, or retrieve it from the CLI by going through the task's elastic network interface, as sketched below.
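This is a rough sketch assuming the TASK_UUID and other variables set above; the exact output shape can vary slightly across CLI versions:
# Extract the ENI attached to the task
ENI_ID=$(aws ecs describe-tasks --tasks \
  "arn:aws:ecs:${AWS_REGION}:${AWS_ACCOUNT_ID}:task/${TASK_UUID}" \
  --cluster ${CLUS_NAME} --region ${AWS_REGION} \
  --query "tasks[0].attachments[0].details[?name=='networkInterfaceId'].value" \
  --output text)
# Look up the public IP associated with that ENI
aws ec2 describe-network-interfaces --network-interface-ids ${ENI_ID} \
  --query "NetworkInterfaces[0].Association.PublicIp" --output text \
  --region ${AWS_REGION}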
Using secrets securely and effectively in the microservice
Now let us assume that our application needs to access a database password. We have already seen good reasons why this password should not be part of the container image. What are our options?
- The right way to use secrets with ECS / Fargate is to assign each secret a key or identifier, and then add these key-secret pairs to the AWS Systems Manager Parameter Store. By using the value type SecureString, the secret values are encrypted, by default with a common AWS-managed key that you can override with your own. For our example, we use the default key.
- In order for the containers to access these secrets and decrypt them, certain IAM permissions must be granted.
- Task definitions must reference the secrets so that they are available to the tasks as environment variables.
Adding secrets to AWS Systems Manager Parameter Store
To add a secret, use the following command:
aws ssm put-parameter --name "${DB_SECRET_KEY}" \
--type SecureString --value "${DB_PASSWD}" \
--overwrite --region ${AWS_REGION}
You should set DB_SECRET_KEY to any unique identifier meaningful for your purpose. It is often set to a hierarchical path such as /container/myapp/secrets/dbpass. DB_PASSWD, of course, contains your database password.
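To verify that the parameter was stored and decrypts correctly, you can read it back (be careful where you run this, since it prints the plaintext value):
aws ssm get-parameter --name "${DB_SECRET_KEY}" --with-decryption \
  --query "Parameter.Value" --output text --region ${AWS_REGION}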
Allowing containers access to secrets
An ecsTaskExecutionRole IAM role, with the managed AmazonECSTaskExecutionRolePolicy attached, is typically available in an ECS account; the container agent assumes this role to run tasks. For the containers to be able to access secrets stored as SecureString parameters in the AWS Systems Manager Parameter Store, as described earlier, additional IAM permissions must be granted. One way to do this is to add the permissions as an inline policy on ecsTaskExecutionRole. You can do this using the JSON spec and command described below.
Save the following JSON content in a file called fargate-app-secrets.json after replacing the referenced variables appropriately:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameters",
        "kms:Decrypt"
      ],
      "Resource": [
        "arn:aws:ssm:${AWS_REGION}:${AWS_ACCOUNT_ID}:parameter/${DB_SECRET_KEY}",
        "arn:aws:ssm:${AWS_REGION}:${AWS_ACCOUNT_ID}:parameter/${SECRET_KEY1}",
        ...
        "arn:aws:kms:${AWS_REGION}:${AWS_ACCOUNT_ID}:alias/aws/ssm"
      ]
    }
  ]
}
Note that the Resource array lists the ARNs of the individual secret parameters. The last entry, alias/aws/ssm, refers to the default key used to encrypt the secrets themselves, so that the task can decrypt them. Now patch ecsTaskExecutionRole with this inline policy:
aws iam put-role-policy --role-name ecsTaskExecutionRole \
--policy-name FargateSecretsAccessor --policy-document \
file://./fargate-app-secrets.json --region ${AWS_REGION}
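You can verify that the inline policy is now attached to the role with:
aws iam get-role-policy --role-name ecsTaskExecutionRole \
  --policy-name FargateSecretsAccessor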
Referencing secrets in task definitions
Finally, you must update your task definitions so that the secret values are made available to your container via environment variables.
In the task definition JSON, inside each entry under the containerDefinitions attribute, specify the secrets to expose to the container and the environment variable each secret is mapped to. Because the task now uses secrets, the task definition must also name an execution role via the top-level executionRoleArn attribute; here we point it at the ecsTaskExecutionRole we patched above so that ECS can fetch and decrypt the parameter values on your behalf. The result looks like this:
{
  "family": "nginxdemos-hello",
  "networkMode": "awsvpc",
  "executionRoleArn": "arn:aws:iam::${AWS_ACCOUNT_ID}:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "nginxdemos-hello",
      "image": "nginxdemos/hello:latest",
      "portMappings": [
        {
          ...
        }
      ],
      "essential": true,
      "secrets": [
        {
          "name": "ENV_VAR1_NAME",
          "valueFrom": "arn:aws:ssm:${AWS_REGION}:${AWS_ACCOUNT_ID}:parameter/${SECRET_KEY1}"
        },
        {
          "name": "ENV_VAR2_NAME",
          "valueFrom": "arn:aws:ssm:${AWS_REGION}:${AWS_ACCOUNT_ID}:parameter/${SECRET_KEY2}"
        }
      ]
    }
  ],
  ...
}
The rest of the process is identical to creating tasks without secrets. If you now create a task using the above task definition, your task's processes can access the secrets via the environment variables listed in the secrets section.
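Inside the container the secrets look like ordinary environment variables. As a rough illustration, assuming you mapped the database password to a variable named DB_PASSWD in the secrets section, a startup script could check that it arrived without printing it:
# DB_PASSWD is a hypothetical variable name mapped via the "secrets" section
if [ -z "${DB_PASSWD}" ]; then
  echo "Database password not set; check the task definition and IAM policy" >&2
  exit 1
fi
echo "Database password received (${#DB_PASSWD} characters)"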
Most AWS CLI commands print JSON on standard output for both success and error responses. Using a command-line JSON processor like jq, it is easy to parse useful values from this output and write simple automation scripts.
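For example, assuming the cluster from earlier, the following snippet captures the ARN of the first running task and strips it down to the task UUID used in the describe-tasks command above:
TASK_ARN=$(aws ecs list-tasks --cluster ${CLUS_NAME} --region ${AWS_REGION} \
  | jq -r '.taskArns[0]')
TASK_UUID=${TASK_ARN##*/}   # keep only the segment after the last '/'
echo ${TASK_UUID}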
Summary
The process that builds your container images should never have to deal with secrets. This means there should be no secrets in any of your repositories, public or private, and no secrets inside your container images.
Instead, secrets should be handled entirely during deployment of your service. An installation script should read the secret values, either interactively from a user or from a secure input file, each time the service is deployed, as sketched below.
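A minimal sketch of such a script, assuming the variable names used earlier in this article, might look like this:
#!/bin/bash
# Hypothetical deployment helper: prompt for the database password at deploy
# time and push it to the Parameter Store; never write it to disk or to git.
read -r -s -p "Database password: " DB_PASSWD; echo
aws ssm put-parameter --name "${DB_SECRET_KEY}" --type SecureString \
  --value "${DB_PASSWD}" --overwrite --region ${AWS_REGION}
# ...then register the task definition and run the task as described above...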
Appendix: Getting the latest AWS CLI
The secrets attribute of a container entry under containerDefinitions was not recognized by the schema validation logic in the AWS CLI until recently, so you need a fairly recent version of the AWS CLI (March 2019 or later). To get one, run the following commands:
sudo apt install python3-pip
sudo pip3 install --upgrade awscli
If you already have an older version of awscli installed via a native package, you may first uninstall it using sudo apt remove awscli or the equivalent.