How to create a CloudWatch Alarm in AWS CDK

Borislav Hadzhiev

Last updated: Jan 27, 2024
7 min


# CloudWatch Alarms Introduction

AWS Services emit metrics that we can use to set up alarms via CloudWatch.

For example, metrics can be:

  • ConcurrentExecutions, Duration, Errors for a Lambda function.
  • CPUUtilization, DiskReadOps, DiskWriteOps for EC2 instances.
  • ConsumedReadCapacityUnits, ConsumedWriteCapacityUnits, ThrottledRequests for DynamoDB.

Every AWS service documents the list of metrics that it makes available by default.

Once we have created the metrics we want to track, we can create an alarm.

The purpose of an alarm in CloudWatch is to notify us when the metrics we've set reach specific values, over a specified period of time.

For instance, we can create an alarm that notifies us:

  • if the sum of Errors of a Lambda function is greater than or equal to 5 over a period of 3 minutes

  • if the average Duration of a Lambda function's invocations exceeds 2 seconds over a period of 3 minutes

  • if the sum of throttled DynamoDB requests exceeds 3 over a period of 5 minutes
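The conditions above all have the same shape: aggregate the datapoints of a period with a statistic, then compare the result to a threshold. Here is a minimal sketch of that idea in plain TypeScript. The `AlarmRule` type and the `aggregate` and `breaches` functions are hypothetical names used for illustration only; they are not part of CloudWatch or the CDK, and the sketch assumes each period contains at least one datapoint.

```typescript
// Hypothetical types for illustration only; not part of the AWS SDK or CDK.
type Statistic = 'Sum' | 'Average' | 'Minimum' | 'Maximum';
type Comparison = '>=' | '>';

interface AlarmRule {
  statistic: Statistic;
  comparison: Comparison;
  threshold: number;
}

// Aggregate the raw datapoints collected during one period.
// Assumes at least one datapoint is present.
function aggregate(datapoints: number[], statistic: Statistic): number {
  const sum = datapoints.reduce((total, value) => total + value, 0);
  switch (statistic) {
    case 'Sum':
      return sum;
    case 'Average':
      return sum / datapoints.length;
    case 'Minimum':
      return Math.min(...datapoints);
    case 'Maximum':
      return Math.max(...datapoints);
  }
}

// Returns true when the aggregated value breaches the rule's threshold.
function breaches(datapoints: number[], rule: AlarmRule): boolean {
  const value = aggregate(datapoints, rule.statistic);
  return rule.comparison === '>='
    ? value >= rule.threshold
    : value > rule.threshold;
}

// Sum of Errors >= 5 over a 3-minute period (one datapoint per minute).
const errorsRule: AlarmRule = { statistic: 'Sum', comparison: '>=', threshold: 5 };
const errorsBreached = breaches([2, 0, 3], errorsRule);

// Average Duration > 2000 ms over a 3-minute period.
const durationRule: AlarmRule = { statistic: 'Average', comparison: '>', threshold: 2000 };
const durationBreached = breaches([1800, 1900, 2000], durationRule);

console.log({ errorsBreached, durationBreached });
```

With these made-up datapoints, the errors rule breaches (2 + 0 + 3 = 5), while the duration rule does not (the average is 1900 ms).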

# Creating Alarms in AWS CDK

We are going to create a small CDK application that consists of the following resources:

  • a Lambda function
  • a metric that tracks the number of function invocation errors
  • a metric that tracks how many times our Lambda function was invoked
  • an alarm that triggers if the SUM of errors for Lambda function invocations is greater than or equal to 1 over a period of 1 minute
  • an alarm that triggers if the SUM of Lambda invocations is greater than or equal to 1 over a period of 1 minute

The code for this article is available on GitHub.

Let's start by defining the Lambda function and the metrics.

lib/cdk-starter-stack.ts
import * as cloudwatch from 'aws-cdk-lib/aws-cloudwatch';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as cdk from 'aws-cdk-lib';
import * as path from 'path';

export class MyCdkStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props: cdk.StackProps) {
    super(scope, id, props);

    // 👇 lambda function definition
    const myFunction = new lambda.Function(this, 'my-function', {
      runtime: lambda.Runtime.NODEJS_18_X,
      memorySize: 1024,
      timeout: cdk.Duration.seconds(5),
      handler: 'index.main',
      code: lambda.Code.fromAsset(path.join(__dirname, '/../src/my-lambda')),
    });

    // 👇 define a metric for lambda errors
    const functionErrors = myFunction.metricErrors({
      period: cdk.Duration.minutes(1),
    });

    // 👇 define a metric for lambda invocations
    const functionInvocation = myFunction.metricInvocations({
      period: cdk.Duration.minutes(1),
    });
  }
}

If you still use CDK version 1, switch to the cdk-v1 branch in the GitHub repository.

The code for the Lambda function could be as simple as follows.

src/my-lambda/index.js
async function main(event) {
  throw new Error('An unexpected error occurred');
}

module.exports = {main};

In the code sample we:

  1. Created a Lambda function.
  2. Created 2 metrics, by using the metricErrors and metricInvocations methods exposed by the Lambda Function construct.

The higher-level constructs often expose methods that allow us to create metric objects, without having to manually instantiate the Metric class from the CloudWatch module.

For example, if we were working with a DynamoDB table, we could take advantage of the metric* helper methods exposed by the Table construct.
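As a rough sketch, a few of the metric helpers on a DynamoDB table could be used as shown below. The stack name, the table, and its `id` partition key are made up for illustration; `metricConsumedReadCapacityUnits`, `metricConsumedWriteCapacityUnits` and `metricUserErrors` are methods of the Table construct in aws-cdk-lib.

```typescript
import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';

// Hypothetical app and stack, used only to host the example table.
const app = new cdk.App();
const stack = new cdk.Stack(app, 'table-metrics-sketch');

// Hypothetical table with a made-up partition key.
const table = new dynamodb.Table(stack, 'my-table', {
  partitionKey: {name: 'id', type: dynamodb.AttributeType.STRING},
});

// Each helper returns a Metric object that can later back an alarm.
const consumedReads = table.metricConsumedReadCapacityUnits({
  period: cdk.Duration.minutes(5),
});

const consumedWrites = table.metricConsumedWriteCapacityUnits({
  period: cdk.Duration.minutes(5),
});

const userErrors = table.metricUserErrors({
  period: cdk.Duration.minutes(5),
});
```

The returned objects are ordinary CloudWatch metrics, so they can be used for alarms in the same way as the Lambda metrics in this article.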

Next, let's add the alarms that will be triggered when our metrics reach a specified threshold.

lib/cdk-starter-stack.ts
import * as cloudwatch from 'aws-cdk-lib/aws-cloudwatch';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as cdk from 'aws-cdk-lib';
import * as path from 'path';

export class MyCdkStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props: cdk.StackProps) {
    super(scope, id, props);

    // ... rest of the code

    // 👇 create an Alarm using the Alarm construct
    new cloudwatch.Alarm(this, 'lambda-errors-alarm', {
      metric: functionErrors,
      threshold: 1,
      comparisonOperator:
        cloudwatch.ComparisonOperator.GREATER_THAN_OR_EQUAL_TO_THRESHOLD,
      evaluationPeriods: 1,
      alarmDescription:
        'Alarm if the SUM of Errors is greater than or equal to the threshold (1) for 1 evaluation period',
    });

    // 👇 create an Alarm directly on the Metric
    functionInvocation.createAlarm(this, 'lambda-invocation-alarm', {
      threshold: 1,
      evaluationPeriods: 1,
      alarmDescription:
        'Alarm if the SUM of Lambda invocations is greater than or equal to the threshold (1) for 1 evaluation period',
    });
  }
}

We created a CloudWatch alarm using the Level 2 Alarm construct.

For the metric, we used the functionErrors metric we created earlier.

The threshold is the value against which the statistic emitted by the metric is compared. In our case, the number of failed invocations of the Lambda function is compared against a threshold of 1.

The comparisonOperator is the operator we use to compare the statistic emitted by the metric against the threshold. In our case, if the SUM of invocation errors from the Lambda function is GREATER_THAN_OR_EQUAL_TO the threshold of 1 over 1 evaluation period of 1 minute, the alarm is triggered.

The evaluationPeriods property is the number of consecutive periods over which the threshold is compared to the statistic emitted by the metric. We set the period property to 1 minute when we created our metrics, so the threshold is compared to the metric's statistic for 1 evaluation period of 1 minute.
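To make the role of evaluationPeriods more concrete, here is a minimal sketch in plain TypeScript rather than CDK. The `shouldAlarm` function and its parameters are hypothetical names used for illustration. The sketch assumes one aggregated value per period, an evaluationPeriods value of at least 1, a greater-than-or-equal comparison, and that every one of the most recent periods has to breach the threshold.

```typescript
// Hypothetical helper for illustration; not part of the CDK or the AWS SDK.
// periodValues holds one aggregated value (e.g. the Sum) per period, oldest first.
function shouldAlarm(
  periodValues: number[],
  threshold: number,
  evaluationPeriods: number,
): boolean {
  if (periodValues.length < evaluationPeriods) {
    return false; // not enough periods have elapsed yet
  }

  // Only the most recent `evaluationPeriods` periods are considered.
  const mostRecent = periodValues.slice(-evaluationPeriods);
  return mostRecent.every((value) => value >= threshold);
}

// 1 evaluation period: a single breaching period is enough.
const singlePeriod = shouldAlarm([0, 1], 1, 1);

// 2 evaluation periods: the breach has to be sustained.
const twoPeriodsInterrupted = shouldAlarm([1, 0, 1], 1, 2);
const twoPeriodsSustained = shouldAlarm([0, 1, 2], 1, 2);

console.log({singlePeriod, twoPeriodsInterrupted, twoPeriodsSustained});
```

With 1 evaluation period, as in our stack, a single minute with at least 1 error is enough to trigger the alarm. Raising evaluationPeriods makes the alarm less sensitive to short spikes.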

To create the second alarm, we invoked the createAlarm method directly on the metric. We didn't pass a comparisonOperator, so the alarm uses the default, GREATER_THAN_OR_EQUAL_TO_THRESHOLD. Defining alarms in CDK can be done in multiple ways, so it's a matter of personal preference.

Let's now create the stack and look at the result:

shell
npx aws-cdk deploy

If I open my CloudFormation console, I can see that the resources were created successfully.

If I open my CloudWatch console, I can see that the alarms are in the Insufficient data state.

Metric Alarms can be in 1 of 3 states:

  • OK - the metric is within the specified threshold
  • ALARM - the metric is outside the specified threshold
  • INSUFFICIENT_DATA - the alarm has just started or there isn't enough data available to determine the alarm's state.
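The three states can be modelled with a small sketch in plain TypeScript. The `deriveState` helper is hypothetical and deliberately simplified: it looks at a single aggregated value and uses a greater-than-or-equal comparison, whereas CloudWatch's handling of missing data is configurable, which this sketch ignores.

```typescript
// The three states a metric alarm can be in.
type AlarmState = 'OK' | 'ALARM' | 'INSUFFICIENT_DATA';

// Hypothetical helper: derive the state from the latest aggregated value.
// `undefined` stands for a period in which the metric reported no data.
function deriveState(
  latestValue: number | undefined,
  threshold: number,
): AlarmState {
  if (latestValue === undefined) {
    return 'INSUFFICIENT_DATA';
  }
  return latestValue >= threshold ? 'ALARM' : 'OK';
}

// Right after deployment no datapoints exist yet.
const beforeFirstDatapoint = deriveState(undefined, 1);

// A value at or above the threshold puts the alarm into the ALARM state.
const atThreshold = deriveState(1, 1);

// A value below the threshold keeps the alarm in the OK state.
const belowThreshold = deriveState(0, 1);

console.log(beforeFirstDatapoint, atThreshold, belowThreshold);
```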

We've set up 2 alarms:

  • One to watch for invocation errors in our Lambda function
  • One to watch for function invocations

Let's invoke our Lambda function using the console and look at the result.

As expected, our Lambda function failed. The failed invocation counts towards both the Invocations and the Errors metrics, so if we look at the state of our alarms, we should see that both have been triggered.

# Discussion

In order to create Alarms in CDK we have to first define a metric and then create our CloudWatch alarm.

We only created metrics by using the metric* methods exposed on the Function construct, i.e.:

const functionErrors = myFunction.metricErrors({
  period: cdk.Duration.minutes(1),
});

However, some of the higher-level constructs might not expose a helper method for all of the metrics we need to create.

In this case, we can define our metrics by using the Metric class, for example:

// 👇 manually instantiate a Metric
const concurrentExecutions = new cloudwatch.Metric({
  namespace: 'AWS/Lambda',
  metricName: 'ConcurrentExecutions',
  period: cdk.Duration.minutes(5),
  statistic: 'Maximum',
  // in CDK v2 (aws-cdk-lib) dimensions are passed via dimensionsMap
  dimensionsMap: {
    FunctionName: myFunction.functionName,
  },
});

In the code sample:

  1. We created a metric in the namespace AWS/Lambda. A namespace denotes an AWS service and the names are in the form of AWS/ServiceName, for example AWS/DynamoDB, AWS/ApiGateway.

    As the letter casing can be confusing, you can view the specific casing of a service's name by clicking on the Metrics section in your CloudWatch console and filtering by the name of the service.

  2. The metricName property is set to ConcurrentExecutions. We can find the available metrics for an AWS service by searching for "ServiceName cloudwatch metrics". Lambda-specific CloudWatch metrics are listed in the "Lambda function metrics" documentation.

  3. The period property is the period over which the specified statistic is applied. In our case, we're tracking concurrent executions for a Lambda function over a period of 5 minutes.

  4. The statistic property is an aggregate of metric data over a specified period of time. The statistic can be: Minimum, Maximum, Sum, Average, etc. In our case, we're taking the Maximum number of concurrent Lambda function executions over a period of 5 minutes.

  5. The dimensions allow us to filter the results that CloudWatch returns. In the example, we get statistics for a specific Lambda function, because we've set the FunctionName dimension.
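A manually instantiated metric can back an alarm in the same way as one returned by a helper method. Here is a rough sketch, assuming a function named `my-function-name` already exists and that 50 concurrent executions is a meaningful limit; both values, as well as the stack and alarm names, are made up for illustration. `dimensionsMap` is the property name used by aws-cdk-lib (CDK v2).

```typescript
import * as cdk from 'aws-cdk-lib';
import * as cloudwatch from 'aws-cdk-lib/aws-cloudwatch';

// Hypothetical app and stack, used only to host the example alarm.
const app = new cdk.App();
const stack = new cdk.Stack(app, 'manual-metric-sketch');

// Manually describe the metric: namespace + name + dimensions identify it,
// period + statistic describe how its datapoints are aggregated.
const concurrentExecutions = new cloudwatch.Metric({
  namespace: 'AWS/Lambda',
  metricName: 'ConcurrentExecutions',
  period: cdk.Duration.minutes(5),
  statistic: 'Maximum',
  dimensionsMap: {
    FunctionName: 'my-function-name', // made-up function name
  },
});

// Alarm if the Maximum exceeds the made-up limit of 50 in a 5-minute period.
concurrentExecutions.createAlarm(stack, 'concurrent-executions-alarm', {
  threshold: 50,
  evaluationPeriods: 1,
  comparisonOperator: cloudwatch.ComparisonOperator.GREATER_THAN_THRESHOLD,
});
```

Passing the comparisonOperator explicitly is optional, but it makes the intent of the alarm easier to read.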

# Conclusion

To create CloudWatch alarms, we first define our metrics and then create an alarm that compares a threshold we've set, to the statistic emitted by a metric over a period of time.

Most of the time we are able to use predefined helper methods, already written for us by the CDK team, by using the metric* methods on the construct.

const functionInvocation = myFunction.metricInvocations({
  period: cdk.Duration.minutes(1),
});

In the case that the helper method for the metric we need is not implemented, we can manually create a metric using the Metric class.

There are also multiple ways to create CloudWatch Alarms in CDK.

We can either create an alarm, using the Alarm construct:

new cloudwatch.Alarm(this, 'lambda-errors-alarm', {
  metric: functionErrors,
  threshold: 1,
  comparisonOperator:
    cloudwatch.ComparisonOperator.GREATER_THAN_OR_EQUAL_TO_THRESHOLD,
  evaluationPeriods: 1,
  alarmDescription:
    'Alarm if the SUM of Errors is greater than or equal to the threshold (1) for 1 evaluation period',
});

Or we can create an alarm using the createAlarm method directly on the metric object:

functionInvocation.createAlarm(this, 'lambda-invocation-alarm', {
  threshold: 1,
  evaluationPeriods: 1,
  alarmDescription:
    'Alarm if the SUM of Lambda invocations is greater than or equal to the threshold (1) for 1 evaluation period',
});

# Clean up

We can delete the provisioned resources by running the cdk destroy command:

shell
npx aws-cdk destroy


Copyright © 2024 Borislav Hadzhiev