AWS CLOUD DEVELOPMENT KIT REVIEW

Today, in IT, Infrastructure as Code (IaC) is a requirement. It is no longer a “nice to have”: we cannot keep relying on installation procedures stored in a timestamped Word document, which are slow and error-prone, when we have better things to do. AWS gives us all the tools we need, so there is no excuse left for not using them. From the low-level AWS REST API to the higher-level Serverless Application Model (SAM), different tools fit different needs. And AWS CloudFormation templates are probably the files in a Git repository that best embody IaC.

CloudFormation is nice, very nice, until the day we have 20+ templates with thousands of lines each. The most extreme case is probably the template used to define a VPC and its subnets. We only wanted a VPC with 5 subnets in 3 AZs, and we ended up spending 2 hours copying and pasting, with a resulting template that is hard to read. We can all agree that there are more rewarding things to do, and that we are far from the DRY (Don’t Repeat Yourself) philosophy.

CloudFormation is an invaluable tool, but we need more flexibility, less copy/paste, less typing, variables, simple conditions, loops, etc. In other words, we need a programming language. The idea is not new: Troposphere (Python), SparkleFormation (Ruby), etc. are tools based on this very principle, using a programming language to generate CloudFormation templates. While they remove the redundancy, they basically only add one layer on top of CloudFormation.

And this is where the AWS Cloud Development Kit (CDK) comes in!

All the source code presented in this article is available in a Github repository.

What is CDK?

Unlike new services, which are announced only once available, AWS open source projects (Amplify Framework, IDE plugins, etc.) are revealed as preview versions, which was the case for the CDK too. We waited for almost a year, but version 1.0.0 was released in July 2019.

AWS CDK could be summarized as “yet another CloudFormation generator”, but that would be a misleading and incomplete statement. AWS CDK aims to do much more. In a few lines of code, we can:

  • create a VPC and its subnets; or even
  • package a Lambda function, upload it to S3 and create it through CloudFormation (like Serverless).

And all of this without seeing a single line of a CloudFormation template.

AWS CDK comprises a Toolkit (a CLI application) and multiple modules (comparable to the SDK modules/libraries) that we’ll use to create our Infrastructure as Code. Both are written in TypeScript and downloadable from NPM. The modules can be used with JavaScript and TypeScript, but not only those: through JSII we can use them with Python, Java, C# and F# too. AWS CDK source code is available on GitHub.

But before going any further, let’s see how JSII works.

JSII

JSII is a tool created by AWS to use JavaScript modules from other languages (the ones cited above). We’ll use Java as an example. We first need to use JSII to “convert” the JavaScript module into a JAR. In this phase, JSII parses the JavaScript module and creates Java classes, methods, etc. corresponding to their JavaScript counterparts. But instead of fully transpiling the JavaScript to Java, the bodies of the Java methods are just delegates which have only one purpose when executed: handing the call over to the JavaScript implementation (yes, it’s weird). In the end we have a JAR containing all the generated classes with their delegate methods, a generated Maven pom.xml, and a dependency on the JSII runtime JAR. And this is where things start to get interesting. The JSII runtime is packaged with a JavaScript server. So, when using JSII through our Java classes:

  1. a new process is spawned: new ProcessBuilder("node", "jsii-runtime.js").start();
  2. the Java “delegate” methods turn each call into a request for the JavaScript side;
  3. the request is sent to the server; and
  4. the server returns the result of the computation.

Performance-wise it’s horrible, but that doesn’t matter here. This is a smart way to write the code once in JavaScript and run it easily from different programming languages.
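The round trip described above can be pictured with a small sketch. This is only a loose illustration of the delegate idea, written entirely in JavaScript so that it runs with Node.js alone; it is not the actual JSII protocol. The JSON message shape, the callKernel helper and the tiny add API are all hypothetical names chosen for the example.

```javascript
const { spawnSync } = require('child_process')

// Hypothetical "server" side: reads one JSON request on stdin,
// executes the requested method and writes a JSON response on stdout.
const kernelSource = `
let input = ''
process.stdin.on('data', chunk => { input += chunk })
process.stdin.on('end', () => {
  const { method, args } = JSON.parse(input)
  const api = { add: (a, b) => a + b }
  process.stdout.write(JSON.stringify({ result: api[method](...args) }))
})
`

// "Delegate" side: contains no business logic, it only forwards the call
// to a freshly spawned node process and returns what comes back.
function callKernel(method, args) {
  const child = spawnSync(process.execPath, ['-e', kernelSource], {
    input: JSON.stringify({ method, args }),
    encoding: 'utf8'
  })
  return JSON.parse(child.stdout).result
}

console.log(callKernel('add', [2, 3]))
```

Spawning a process for every call, as this sketch does, is even more wasteful than keeping one server alive, but it makes the cost of crossing the process boundary easy to see.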

On the other hand, I can understand the disappointment of people who thought that, as with the SDKs, “supported languages” would mean “no dependency on another one”. Moreover, the CDK Toolkit is only available through NPM packages.

AWS CDK Getting Started

Since AWS CDK already has an introductory tutorial, we’ll take another approach, with examples using JavaScript. We will start by creating a new project with two dependencies, @aws-cdk/core (the CDK core module) and @aws-cdk/aws-ec2, using Yarn.

$ mkdir <myproject> && cd <myproject>
$ yarn init -y
$ yarn add -E @aws-cdk/core@1.8.0
$ yarn add -E @aws-cdk/aws-ec2@1.8.0

On a side note, AWS CDK uses Lerna to manage its multi-module project. All modules start with @aws-cdk/ and come from a mono-repository.

Then we create a new file named app_1.js with the following code:

const fs = require('fs')
const cdk = require('@aws-cdk/core')
const ec2 = require('@aws-cdk/aws-ec2')

class SimpleStack extends cdk.Stack {
  constructor(scope, id, props) {
    super(scope, id, props)
    new ec2.Vpc(this, 'VPC')
  }
}

const app = new cdk.App()
new SimpleStack(app, 'SimpleStack')
const cloudAssembly = app.synth()
const files = fs.readdirSync(cloudAssembly.directory)
console.log(files)

First, we import the two modules we have previously installed. Then we define a class SimpleStack inheriting from cdk.Stack which, as we can guess, will allow us to define a CloudFormation stack. Here we only want a VPC.

The Vpc constructor has the following parameters:

  • The first one is of type Construct (the root type for all resources). It is the scope to which the resource belongs: usually the cdk.App for a Stack, and the Stack itself (this) for a resource;
  • The second is the identifier of the resource, used to build its name in the generated CloudFormation template; and
  • The third is an optional object of properties. We will play with it next.

These three parameters can be found in all Constructs in AWS CDK.

Finally, the last four lines make sense only once we understand how AWS CDK works. So, for now, let’s put the AWS CDK Toolkit aside and dig a bit deeper.

$ node app_1.js
[ 'SimpleStack.template.json', 'cdk.out', 'manifest.json' ]

To see the content of these three files we can add the following snippet at the end of our script:

// path is needed to build the location of each generated file
const path = require('path')

for (const file of files) {
  console.log(`File: ${file}`)
  console.log(fs.readFileSync(path.join(cloudAssembly.directory, file), 'utf8'))
}

The resulting files can be seen in the GitHub repository.

So, what’s interesting here? Most likely the file named *.template.json attracts our attention! This is the generated CloudFormation template in JSON.

In a single line of code (new ec2.Vpc(this, 'VPC')), we’ve created:

  • 1 VPC
  • 6 Subnets (3 public and 3 private)
  • 1 Internet Gateway
  • 3 NAT Gateways (one in each public subnet, each serving one private subnet)
  • All the necessary network routing (Routes and RouteTables and SubnetRouteTableAssociations)

All CloudFormation resource names (the Logical IDs) start with “VPC” (the value we’ve provided as the second parameter of our constructor), followed by multiple values depending on the resource, and end with 8 characters taken from an MD5 hash. That’s a good start.
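To make the naming scheme more concrete, here is a minimal sketch of how such an identifier could be derived: the path components are concatenated and followed by 8 hexadecimal characters of an MD5 digest of the path. The function name logicalIdSketch, the '/' separator and the choice of the first 8 uppercase characters are assumptions made for the illustration; the exact rules applied by the CDK (which components it keeps, how it truncates long names) are not reproduced here.

```javascript
const crypto = require('crypto')

// Hypothetical helper: builds a readable prefix from the construct path
// and appends 8 characters of an MD5 digest to keep the result unique.
function logicalIdSketch(pathComponents) {
  // CloudFormation logical IDs must be alphanumeric
  const readable = pathComponents
    .map(component => component.replace(/[^A-Za-z0-9]/g, ''))
    .join('')
  const hash = crypto
    .createHash('md5')
    .update(pathComponents.join('/'))
    .digest('hex')
    .slice(0, 8)
    .toUpperCase()
  return readable + hash
}

console.log(logicalIdSketch(['VPC', 'PublicSubnet1', 'Subnet']))
console.log(logicalIdSketch(['VPC', 'PublicSubnet2', 'Subnet']))
```

The readable part keeps the template understandable for humans, while the hash suffix avoids collisions between two constructs whose cleaned-up names would otherwise be identical.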

manifest.json is not that interesting for an introduction to AWS CDK and cdk.out contains only the version of the CX API (Cloud Executable) used.

Using CDK Toolkit

Until now we’ve executed our app_1.js script using node to see how things work, but the right way is to use the AWS CDK Toolkit. To do so, we need to install it (locally to the project, which is a good practice to avoid version conflicts in the future):

$ yarn add -D -E aws-cdk@1.8.0

To check the installation:

$ yarn cdk --version

Before using the AWS CDK Toolkit, we are going to create a new file, app_2.js, which is the same as app_1.js without the console lines. To print the CloudFormation template in the console we are going to use the synthesize command:

$ yarn cdk --app "node app_2.js" synthesize
# or its alias 'synth'
$ yarn cdk -a "node app_2.js" synth

A new folder (cdk.out) is created, containing the 3 files we’ve seen before: SimpleStack.template.json, cdk.out and manifest.json. But this time the synthesize command also generates a CloudFormation template in YAML and prints it to the console. The result (app_2.cfn.yml) is exactly the same as what we had previously (SimpleStack.template.json), but in YAML. We can have the output in JSON by using the --json parameter. Internally, the CDK spawns a new child_process using the value of the --app option, and this.synth() is called in the App class constructor.
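The “spawn the app, then read what it produced” mechanism can be sketched as follows. This is a simplified illustration and not the Toolkit’s code: runApp and the stand-in application are hypothetical, and passing the output directory through an environment variable named CDK_OUTDIR is an assumption on our side.

```javascript
const { spawnSync } = require('child_process')
const fs = require('fs')
const os = require('os')
const path = require('path')

// Stand-in for app_2.js: writes one empty template into the directory
// it receives through the (assumed) CDK_OUTDIR environment variable.
const fakeAppSource = `
const fs = require('fs')
const path = require('path')
fs.writeFileSync(path.join(process.env.CDK_OUTDIR, 'Demo.template.json'), '{}')
`

// Runs the application as a child process, then lists the files it generated.
function runApp(command, args) {
  const outdir = fs.mkdtempSync(path.join(os.tmpdir(), 'cdk-out-'))
  const child = spawnSync(command, args, {
    env: { ...process.env, CDK_OUTDIR: outdir }
  })
  if (child.status !== 0) {
    throw new Error(`app exited with status ${child.status}`)
  }
  return fs.readdirSync(outdir)
}

const producedFiles = runApp(process.execPath, ['-e', fakeAppSource])
console.log(producedFiles)
```

The point of this design is that the Toolkit never needs to know in which language the application is written: it only runs a command and reads files.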

A Customized VPC Network

Until now we’ve written only one line of meaningful code but it would be nice if we could tweak things a little bit. Fortunately, the optional object parameter of the class Vpc gives us the ability to have more control. This time, we’d like to have:

  • only 2 AZs;
  • only one NAT Gateway; and
  • 1 more private Subnet isolated from the Internet (in and out)

app_3.js

new ec2.Vpc(this, 'VPC', {
  cidr: '10.128.0.0/16',
  natGateways: 1,
  maxAZs: 2,
  subnetConfiguration: [
    {
      cidrMask: 24,
      name: 'Web',
      subnetType: ec2.SubnetType.PUBLIC
    },
    {
      cidrMask: 24,
      name: 'Application',
      subnetType: ec2.SubnetType.PRIVATE
    },
    {
      cidrMask: 24,
      name: 'Database',
      subnetType: ec2.SubnetType.ISOLATED
    }
  ]
})

Everything is self-explanatory, which is impressive. We can then run the synthesis once again (changing the file name):

$ yarn cdk --app "node app_3.js" synth

We now have exactly what we wanted (app_3.cfn.yml) in a few lines of code. Unfortunately, we cannot have more fine-grained configuration (e.g. the CIDR of each subnet).
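To get a feeling for what the configuration above implies, we can work out the arithmetic ourselves: 3 subnet groups across 2 AZs means 6 subnets, each a /24 carved out of 10.128.0.0/16. The sketch below assumes that the blocks are handed out one after the other, in the order the groups are declared; that ordering, the planSubnets helper and the generated names are assumptions for the illustration, so the generated template remains the reference for the real values.

```javascript
// Convert a dotted IPv4 address to a number and back
function ipToInt(ip) {
  return ip.split('.').reduce((acc, octet) => acc * 256 + Number(octet), 0)
}

function intToIp(n) {
  return [24, 16, 8, 0].map(shift => Math.floor(n / 2 ** shift) % 256).join('.')
}

// Hypothetical planner: one block per group and per AZ, allocated sequentially
function planSubnets(baseCidr, cidrMask, groupNames, azCount) {
  const [baseIp, basePrefix] = baseCidr.split('/')
  const blockSize = 2 ** (32 - cidrMask)
  const capacity = 2 ** (cidrMask - Number(basePrefix))
  if (groupNames.length * azCount > capacity) {
    throw new Error('not enough address space for the requested subnets')
  }
  let next = ipToInt(baseIp)
  const plan = []
  for (const name of groupNames) {
    for (let az = 1; az <= azCount; az++) {
      plan.push({ name: `${name}Subnet${az}`, cidr: `${intToIp(next)}/${cidrMask}` })
      next += blockSize
    }
  }
  return plan
}

const plan = planSubnets('10.128.0.0/16', 24, ['Web', 'Application', 'Database'], 2)
console.log(plan)
```

A /16 holds 256 blocks of size /24, so this layout uses only 6 of them and leaves plenty of room for growth.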

A new Stack

Now we’d like a new Stack to create an ALB, and we will need information from the previous stack, such as the VpcId and some subnets. First, we need to install the ALB module:

$ yarn add -E @aws-cdk/aws-elasticloadbalancingv2@1.8.0

Then we can create the file app_4.js

// The core module installed earlier is @aws-cdk/core
const cdk = require('@aws-cdk/core')
const ec2 = require('@aws-cdk/aws-ec2')
const elbv2 = require('@aws-cdk/aws-elasticloadbalancingv2')

class VpcStack extends cdk.Stack {
  constructor(scope, id, props) {
    super(scope, id, props)
    // Keep a reference to the VPC so that other stacks can use it
    this.vpc = new ec2.Vpc(this, 'VPC', {
      // [...]
    })
  }
}

class AlbStack extends cdk.Stack {
  constructor(scope, id, vpcStack, props) {
    super(scope, id, props)

    // The load balancer is created in the VPC owned by the other stack
    new elbv2.ApplicationLoadBalancer(this, 'LB', {
      vpc: vpcStack.vpc,
      internetFacing: true
    })
  }
}

const app = new cdk.App()
const vpcStack = new VpcStack(app, 'VpcStack')
new AlbStack(app, 'AlbStack', vpcStack)

The first thing we can do is list the stacks:

$ yarn cdk -a "node app_4.js" list

Which displays:

VpcStack
AlbStack

Then we can execute:

$ yarn cdk -a "node app_4.js" synth

This time nothing is printed in the console; both templates are generated in the cdk.out folder in JSON (app_4.VpcStack.cfn.json & app_4.AlbStack.cfn.json). By indicating the stack name VpcStack in the command, we get the template in YAML:

$ yarn cdk --app "node app_4.js" synth VpcStack

Unfortunately, this works only if the stack doesn’t depend on another one:

$ yarn cdk --app "node app_4.js" synth AlbStack

The result is the same as the command without the stack name.

As we can see in the generated templates, by passing the VpcStack to the AlbStack and then using vpc: vpcStack.vpc, the CDK generates all the necessary exported/imported values for us.
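For readers who have never wired two stacks together by hand, the sketch below shows the general shape of what has to exist on each side: an output with an export name in the producing template, and an Fn::ImportValue in the consuming one. It uses plain objects so that it runs without the CDK; the export name, the logical IDs and the two helper functions are hypothetical and will not match the names the CDK actually generates.

```javascript
// Producer side: publish a value under an export name
function exportValue(exportName, logicalId) {
  return { Value: { Ref: logicalId }, Export: { Name: exportName } }
}

// Consumer side: read the value back from another stack
function importValue(exportName) {
  return { 'Fn::ImportValue': exportName }
}

// Hypothetical export name shared by both templates
const exportName = 'VpcStack:VpcId'

const vpcTemplate = {
  Outputs: { VpcIdOutput: exportValue(exportName, 'VPC') }
}

const albTemplate = {
  Resources: {
    LbSecurityGroup: {
      Type: 'AWS::EC2::SecurityGroup',
      Properties: {
        GroupDescription: 'Security group of the load balancer',
        VpcId: importValue(exportName)
      }
    }
  }
}

console.log(JSON.stringify({ vpcTemplate, albTemplate }, null, 2))
```

Keeping both sides in sync by hand is tedious and error-prone, which is exactly the chore the CDK takes over when we pass one stack’s object to another.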

Cfn Classes

Often (at least for now) the high-level abstraction classes are not highly configurable and sometimes some properties don’t even exist.

Fortunately, all CloudFormation resources have their counterpart classes, prefixed by Cfn (e.g. CfnVPC, CfnLoadBalancer, CfnBudget, etc.).

But we must be aware that the two levels do not combine smoothly: a high-level class generally expects other high-level objects (a Vpc rather than a CfnVPC, for example), so once we use one Cfn class, the related resources usually end up being Cfn classes too. On one side we get full control of our resources, but on the other we have to describe all the properties as we would have done using CloudFormation, and we still have to refer to the CloudFormation documentation, as the current AWS CDK API documentation is simply unusable.

As an example, we will create a Budget resource (without the boilerplate code), only available through a Cfn class.

$ yarn add -E @aws-cdk/aws-budgets@1.8.0

app_5.js

new budgets.CfnBudget(this, 'Budget', {
    budget: {
        budgetLimit: {
            amount: 1000,
            unit: 'USD'
        },
        budgetName: 'MyBudget',
        budgetType: 'COST',
        timeUnit: 'MONTHLY',
        costFilters: {
            Service: [
                'Amazon Elastic Compute Cloud - Compute',
                'Amazon Elastic Block Store'
            ],
            TagKeyValue: [
                'user:Application$MyApp'
            ]
        }
    },
    notificationsWithSubscribers: [{
        notification: {
            comparisonOperator: 'GREATER_THAN',
            notificationType: 'ACTUAL',
            threshold: 80,
            thresholdType: 'PERCENTAGE'
        },
        subscribers: [{
            subscriptionType: 'EMAIL',
            address: 'my@email.com'
        }]
    }]
});

$ yarn cdk --app "node app_5.js" synth

As we can see (app_5.cfn.yml), this time there is no added value compared to a YAML CloudFormation template. On the contrary, we need brackets, braces and quotes. We don’t even have enumerations, which would have been a nice touch. And this is the result with a dynamically typed language. Imagine what the result would have been with C# or Java.

Of course, in some cases we can use programming language features. For example, if we decided to build a VPC with Cfn classes, we could have methods to create a Subnet, a RouteTable, etc. and then call them inside a loop.
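The idea of “small methods called inside a loop” can be sketched as follows. To keep the example runnable without any dependency, it builds plain CloudFormation resource objects rather than instantiating CDK Cfn classes; the helper names, the logical IDs and the CIDR layout are hypothetical.

```javascript
// One helper per resource type, each returning a CloudFormation resource
function subnet(vpcId, cidrBlock, azIndex) {
  return {
    Type: 'AWS::EC2::Subnet',
    Properties: {
      VpcId: { Ref: vpcId },
      CidrBlock: cidrBlock,
      // Pick the n-th availability zone of the current region
      AvailabilityZone: { 'Fn::Select': [azIndex, { 'Fn::GetAZs': '' }] }
    }
  }
}

function routeTable(vpcId) {
  return { Type: 'AWS::EC2::RouteTable', Properties: { VpcId: { Ref: vpcId } } }
}

function association(subnetId, routeTableId) {
  return {
    Type: 'AWS::EC2::SubnetRouteTableAssociation',
    Properties: { SubnetId: { Ref: subnetId }, RouteTableId: { Ref: routeTableId } }
  }
}

// The loop replaces the copy/paste: one iteration per availability zone
function buildResources(azCount) {
  const resources = {
    VPC: { Type: 'AWS::EC2::VPC', Properties: { CidrBlock: '10.128.0.0/16' } }
  }
  for (let i = 0; i < azCount; i++) {
    const n = i + 1
    resources[`Subnet${n}`] = subnet('VPC', `10.128.${i}.0/24`, i)
    resources[`RouteTable${n}`] = routeTable('VPC')
    resources[`Association${n}`] = association(`Subnet${n}`, `RouteTable${n}`)
  }
  return resources
}

console.log(JSON.stringify({ Resources: buildResources(3) }, null, 2))
```

Changing the number of AZs is now a one-character edit instead of another round of copying and pasting.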

Even if this article focuses on the synthesize command of the CDK, deploy and destroy are the ones we will use the most. These commands do the actual work in our AWS accounts by creating/updating and deleting stacks. But if we are a little paranoid, we will want to check what is generated before deploying our stacks. And we will probably want to commit the cdk.out folder to be able to see what has changed between each modification. Too bad that the generated templates are in JSON and not YAML though (at least for now).

Conclusion

Adding a layer (CDK) on top of a layer (CloudFormation) on top of another layer (AWS REST API) is usually a bad idea in terms of new feature propagation. We are all aware of at least one feature missing from CloudFormation that needs to be handled using a Custom Resource. With the AWS CDK the infrastructure becomes an integral part of our application. As we’ve seen, it offers a lot of nice features.

Unfortunately, in this wonderful world we can notice some problems with the high-level API:

  • Adding an abstraction layer on top of CloudFormation means that we will have to wait months to have new AWS services/features implemented. First, we will have to wait for the CloudFormation implementation and then for the CDK one;
  • The abstraction layer introduces a risk of creating resources that we didn’t want. Our first example shows exactly that. We asked for a VPC and the CDK created a lot of things, including NAT Gateways, which can be quite expensive for small organizations or for people new to AWS. And in the end, I’m afraid that we will have to check the generated CloudFormation thoroughly to avoid any surprises;
  • The high-level API makes the AWS CDK a very opinionated tool, with too little flexibility;
  • The high-level API is hard to use, even with the reference documentation (which is still in its infancy). We have nice examples for common use cases, but once we want to dig a little deeper, we face a list of thousands of classes, properties and methods, mixing the high- and low-level APIs;
  • The team seems quite small, with a handful of regular contributors and more than 700 open issues today (December 2019).

In the end AWS CDK is a good idea, not new, but it goes a little further than other tools. Unfortunately, there is still a lot of work to be done to make it the go-to for IaC.
